
Artificial Intelligence Based Methods And Systems For Generating Merchant Related Embeddings For Learning Merchant Behavior

Abstract: Embodiments provide methods and systems for generating merchant-related embeddings for learning merchant behavior. The method includes accessing multiple sequence-related features of card-specific sequence(s) of payment transactions. The method further includes generating, via a first Machine Learning (ML) model, multiple sequence-related embeddings. The method includes generating a merchant-specific embedding for each merchant based on aggregating a relevant set of embeddings of the multiple sequence-related embeddings. The method includes generating, via a second ML model, multiple merchant velocity embeddings based on multiple merchant velocity features corresponding to each merchant. The method includes generating a set of concatenated merchant-related embeddings for each merchant based on concatenating the merchant-specific embedding with the multiple merchant velocity embeddings associated with each merchant. The method includes determining, via the second ML model, a merchant behavior pattern corresponding to each merchant based on the set of concatenated merchant-related embeddings and a predetermined collusion threshold.


Patent Information

Application #
Filing Date
02 February 2024
Publication Number
32/2025
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Parent Application

Applicants

MASTERCARD INTERNATIONAL INCORPORATED
2000 Purchase Street, Purchase, NY 10577, United States of America

Inventors

1. Anand Vir Singh Chauhan
Anand Ice Factory, Near Ganeshbagh temple, Char Shahar Ka Naka, Behind Madan Kui Hanuman Temple, Gwalior 474003, Madhya Pradesh, India
2. Maneet Singh
J-3/95, First Floor, Rajouri Garden, New Delhi, Delhi 110027, India
3. Ushmita Pareek
D-16, Staff Colony, Century Yarn Satarati, Khargone 451660, Madhya Pradesh, India
4. Tarun Somavarapu
C/O Syed Ismail, Plot# 105, Mallikarjuna Nagar, Hyderabad 500068, Telangana, India

Specification

Description: The present disclosure relates to artificial intelligence-based processing systems and, more particularly, to electronic methods and complex processing systems for generating merchant-related embeddings for learning merchant behavior.
BACKGROUND
With the advent of technology, digital payment transactions, i.e., cashless payment transactions, have been on the rise. As a result, the risk of data breaches, identity theft, and financial fraud has grown exponentially as well. One prevalent form of such fraud is known as merchant fraud. As used herein, the term ‘merchant fraud’ refers to a type of fraud in which an individual poses as a legitimate business to deceive its users and make illegitimate profits. Herein, the accounts of the merchants are illegally managed and operated by criminals or fraudsters. Merchant fraud can take several forms, such as bust-out fraud, transaction laundering, identity swap, business remodeling, etc. As may be understood, merchant fraud has become a growing problem that affects businesses of all types and sizes. For example, in e-commerce, it is difficult to verify payment transactions since these transactions are card-not-present (CNP) transactions, as opposed to the card-present (CP) transactions performed at brick-and-mortar merchant stores. In multiple instances, fraudsters may be more likely to commit fraud when CNP transactions take place.
As may be understood, merchant fraud not only affects businesses but also poses a significant financial risk to the users (otherwise known as cardholders) of the businesses that are targeted by merchant fraud. It further damages the reputation of these businesses and results in financial losses and chargebacks for them. Additionally, merchants may also face legal consequences due to merchant fraud. Further, payment processors (such as Mastercard®) may also face financial losses due to chargebacks caused by merchant fraud, which in turn causes them to charge higher payment processing fees to innocent merchants operating in high-risk industries to compensate for the increased merchant fraud risk.
To that end, various techniques have been developed to prevent or curb merchant fraud. For instance, various Artificial Intelligence (AI) or Machine Learning (ML) based models have been developed to detect or predict merchant fraud. However, these conventional AI or ML models suffer from various drawbacks that lead to poor real-world performance when detecting merchant fraud. As may be understood, such conventional AI or ML models are generally task-specific in nature and are trained on merchant-related data alone; as a result, when they are used for detecting merchant fraud, these AI or ML models consider only a particular type of data, i.e., merchant-related data. To that end, the learning regarding the behavior of a merchant in such conventional models is limited or poor, which causes the predictions of these models to be riddled with false positives. In other words, the accuracy, precision, and efficiency of these conventional AI or ML models are quite poor.
Thus, there exists a need for technical solutions such as methods and systems for generating merchant-related embeddings for learning merchant behavior to achieve enhanced model performance, accuracy, precision, and efficiency, thereby overcoming the aforementioned technical drawbacks.
SUMMARY
Various embodiments of the present disclosure provide methods and systems for generating merchant-related embeddings for learning merchant behavior.
In an embodiment, a computer-implemented method for generating merchant-related embeddings for learning merchant behavior is disclosed. The computer-implemented method performed by a server system includes accessing a plurality of sequence-related features from a database associated with the server system. The plurality of sequence-related features corresponds to one or more card-specific sequences of payment transactions performed by a plurality of cardholders at a plurality of merchants. Herein, each card-specific sequence of the one or more card-specific sequences corresponds to a particular cardholder from the plurality of cardholders. The computer-implemented method further includes generating, via a first Machine Learning (ML) model, a plurality of sequence-related embeddings based, at least in part, on the plurality of sequence-related features. Herein, each sequence-related embedding of the plurality of sequence-related embeddings corresponds to each card-specific sequence. Further, the computer-implemented method includes generating a merchant-specific embedding for each merchant of the plurality of merchants based, at least in part, on aggregating a relevant set of embeddings of the plurality of sequence-related embeddings. Furthermore, the computer-implemented method includes generating, via a second ML model, a plurality of merchant velocity embeddings based, at least in part, on a plurality of merchant velocity features corresponding to each merchant of the plurality of merchants accessed from the database. Herein, each merchant velocity embedding corresponds to each payment transaction performed at the corresponding merchant. Moreover, the computer-implemented method includes generating a set of concatenated merchant-related embeddings for each merchant of the plurality of merchants based, at least in part, on concatenating the merchant-specific embedding and the plurality of merchant velocity embeddings associated with each merchant. 
The computer-implemented method includes determining, via the second ML model, a merchant behavior pattern corresponding to each merchant based, at least in part, on the set of concatenated merchant-related embeddings and a predetermined collusion threshold.
In another embodiment, a server system is disclosed. The server system includes a communication interface and a memory including executable instructions. The server system also includes a processor communicably coupled to the memory. The processor is configured to execute the instructions to cause the server system, at least in part, to access a plurality of sequence-related features from a database associated with the server system. The plurality of sequence-related features corresponds to one or more card-specific sequences of payment transactions performed by a plurality of cardholders at a plurality of merchants. Herein, each card-specific sequence of the one or more card-specific sequences corresponds to a particular cardholder from the plurality of cardholders. The server system is further caused to generate, via a first Machine Learning (ML) model, a plurality of sequence-related embeddings based, at least in part, on the plurality of sequence-related features. Herein, each sequence-related embedding of the plurality of sequence-related embeddings corresponds to each card-specific sequence. Further, the server system is caused to generate a merchant-specific embedding for each merchant of the plurality of merchants based, at least in part, on aggregating a relevant set of embeddings of the plurality of sequence-related embeddings. Furthermore, the server system is caused to generate, via a second ML model, a plurality of merchant velocity embeddings based, at least in part, on a plurality of merchant velocity features corresponding to each merchant of the plurality of merchants accessed from the database. Herein, each merchant velocity embedding corresponds to each payment transaction performed at the corresponding merchant. 
Moreover, the server system is caused to generate a set of concatenated merchant-related embeddings for each merchant of the plurality of merchants based, at least in part, on concatenating the merchant-specific embedding and the plurality of merchant velocity embeddings associated with each merchant. The server system is caused to determine via the second ML model, a merchant behavior pattern corresponding to each merchant based, at least in part, on the set of concatenated merchant-related embeddings and a predetermined collusion threshold.
In yet another embodiment, a non-transitory computer-readable storage medium is disclosed. The non-transitory computer-readable storage medium includes computer-executable instructions that, when executed by at least a processor of a server system, cause the server system to perform a method. The method includes accessing a plurality of sequence-related features from a database associated with the server system. The plurality of sequence-related features corresponds to one or more card-specific sequences of payment transactions performed by a plurality of cardholders at a plurality of merchants. Herein, each card-specific sequence of the one or more card-specific sequences corresponds to a particular cardholder from the plurality of cardholders. The method further includes generating, via a first Machine Learning (ML) model, a plurality of sequence-related embeddings based, at least in part, on the plurality of sequence-related features. Herein, each sequence-related embedding of the plurality of sequence-related embeddings corresponds to each card-specific sequence. Further, the method includes generating a merchant-specific embedding for each merchant of the plurality of merchants based, at least in part, on aggregating a relevant set of embeddings of the plurality of sequence-related embeddings. Furthermore, the method includes generating, via a second ML model, a plurality of merchant velocity embeddings based, at least in part, on a plurality of merchant velocity features corresponding to each merchant of the plurality of merchants accessed from the database. Herein, each merchant velocity embedding corresponds to each payment transaction performed at the corresponding merchant. 
Moreover, the method includes generating a set of concatenated merchant-related embeddings for each merchant of the plurality of merchants based, at least in part, on concatenating the merchant-specific embedding and the plurality of merchant velocity embeddings associated with each merchant. The method includes determining, via the second ML model, a merchant behavior pattern corresponding to each merchant of the plurality of merchants based, at least in part, on the set of concatenated merchant-related embeddings and a predetermined collusion threshold.

BRIEF DESCRIPTION OF THE FIGURES
For a more complete understanding of example embodiments of the present technology, reference is now made to the following descriptions taken in connection with the accompanying drawings in which:
FIG. 1 illustrates a schematic representation of an environment related to at least some example embodiments of the present disclosure;
FIG. 2 illustrates a simplified block diagram of a server system, in accordance with an embodiment of the present disclosure;
FIG. 3 illustrates a block diagram representation depicting a detailed flow of performing fraud detection from a set of concatenated merchant-related embeddings, in accordance with an embodiment of the present disclosure;
FIG. 4A illustrates a process flow diagram depicting a process of generating a merchant-specific embedding from a plurality of sequence-related embeddings, in accordance with an embodiment of the present disclosure;
FIG. 4B illustrates a process flow diagram depicting a process of generating a set of concatenated merchant-related embeddings for each merchant, in accordance with an embodiment of the present disclosure;
FIG. 4C illustrates a process flow diagram depicting a process of determining a merchant behavior pattern from the set of concatenated merchant-related embeddings, in accordance with an embodiment of the present disclosure;
FIG. 5 illustrates a process flow diagram depicting a process of labeling a future payment transaction, in accordance with an embodiment of the present disclosure;
FIG. 6 illustrates a process flow diagram depicting a process of training and generating a first Machine Learning (ML) model, in accordance with an embodiment of the present disclosure;
FIG. 7 illustrates a process flow diagram depicting a process of training and generating a second ML model, in accordance with an embodiment of the present disclosure;
FIG. 8 illustrates a flow diagram depicting a method for generating merchant-related embeddings for learning merchant behavior, in accordance with an embodiment of the present disclosure; and
FIG. 9 illustrates a simplified block diagram of a payment server, in accordance with an embodiment of the present disclosure.
The drawings referred to in this description are not to be understood as being drawn to scale except if specifically noted, and such drawings are only exemplary in nature.

DETAILED DESCRIPTION
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. It will be apparent, however, to one skilled in the art that the present disclosure can be practiced without these specific details. Descriptions of well-known components and processing techniques are omitted so as to not unnecessarily obscure the embodiments herein. The examples used herein are intended merely to facilitate an understanding of ways in which the embodiments herein may be practiced and to further enable those of skill in the art to practice the embodiments herein. Accordingly, the examples should not be construed as limiting the scope of the embodiments herein.
Reference in this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. The appearances of the phrase “in an embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Moreover, various features are described which may be exhibited by some embodiments and not by others. Similarly, various requirements are described which may be requirements for some embodiments but not for other embodiments.
Moreover, although the following description contains many specifics for the purposes of illustration, anyone skilled in the art will appreciate that many variations and/or alterations to said details are within the scope of the present disclosure. Similarly, although many of the features of the present disclosure are described in terms of each other, or in conjunction with each other, one skilled in the art will appreciate that many of these features can be provided independently of other features. Accordingly, this description of the present disclosure is set forth without any loss of generality to, and without imposing limitations upon, the present disclosure.
Embodiments of the present disclosure may be embodied as an apparatus, a system, a method, or a computer program product. Accordingly, embodiments of the present disclosure may take the form of an entire hardware embodiment, an entire software embodiment (including firmware, resident software, micro-code, etc.), or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit”, “engine”, “module”, or “system”. Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer-readable storage media having computer-readable program code embodied thereon.
The terms “payment transaction”, “financial transaction”, “e-commerce transaction”, “digital transaction”, and “transaction” are used interchangeably throughout the description and refer to a transaction of payment of a certain amount being initiated by the cardholder. More specifically, the terms refer to electronic financial transactions including, for example, online payments and payments at a terminal (e.g., a point-of-sale (POS) terminal, an ATM, self-service kiosks, and the like). It may be understood that online payments can be performed on Electronic-commerce (E-commerce) platforms that are either provided on web browsers or by merchants in the form of mobile applications. For instance, by inputting various payment card details on a payment gateway connected to the merchant on the merchant application, a cardholder may carry out a payment transaction on a merchant’s application installed on their smartphone to buy items or pay for services.
The terms “account holder”, “user”, “cardholder”, “consumer”, and “buyer” are used interchangeably throughout the description and refer to a person who has a payment account or at least one payment card (e.g., credit card, debit card, etc.) associated with the payment account, that will be used by a merchant to perform a payment transaction. The payment account may be opened via an issuing bank or an issuer server.
The term “merchant”, used throughout the description generally refers to a seller, a retailer, a purchase location, an organization, or any other entity that is in the business of selling goods or providing services, and it can refer to either a single business location or a chain of business locations of the same entity.
The terms “merchant application” and “merchant platform” used interchangeably, throughout the description, may refer to a software application associated with a particular merchant or may be associated with a number of different merchants. The merchant application may be capable of identifying a particular merchant (or multiple merchants) that is participating in a transaction with a cardholder. To that end, the merchant application is configured to allow a cardholder to perform a payment transaction with a merchant in exchange for goods or services.
The term “issuer”, used throughout the description, refers to a financial institution normally called an “issuer bank” or “issuing bank” in which an individual or an institution may have an account. The issuer also issues a payment card, such as a credit card or a debit card, etc. Further, the issuer may also facilitate online banking services such as electronic money transfer, bill payment, etc., to the account holders through a server called “issuer server” throughout the description.
Further, the term “acquirer” refers to a financial institution (e.g., a bank) that processes financial transactions for merchants. In other words, this can be an institution that facilitates the processing of payment transactions for physical stores, merchants, or institutions that own platforms that make either online purchases or purchases made via software applications possible (e.g., shopping cart platform providers and in-app payment processing providers). The terms “acquirer”, “acquiring bank”, and “acquirer server” will be used interchangeably herein.
The term “payment account” used throughout the description refers to a financial account that is used to fund a financial transaction (interchangeably referred to as “payment transaction” or “transaction”). Examples of the financial account include, but are not limited to, a savings account, a credit account, a checking account, and a virtual payment account. The financial account may be associated with an entity such as an individual person, a family, a commercial entity, a company, a corporation, a governmental entity, a non-profit organization, and the like. In some scenarios, the financial account may be a virtual or temporary payment account that can be mapped or linked to a primary financial account, such as those accounts managed by payment wallet service providers, and the like.
The terms “payment network” and “card network” are used interchangeably throughout the description and refer to a network or collection of systems used for the transfer of funds through the use of cash substitutes. Payment networks may use a variety of different protocols and procedures in order to process the transfer of money for various types of transactions. Payment networks are companies that connect an issuing bank with an acquiring bank to facilitate online payment. Transactions that may be performed via a payment network may include product or service purchases, credit purchases, debit transactions, fund transfers, account withdrawals, etc. Payment networks may be configured to perform transactions via cash substitutes that may include payment cards, letters of credit, checks, financial accounts, etc. Examples of networks or systems configured to perform as payment networks include those operated by Mastercard®.
The term “payment card”, used throughout the description, refers to a physical or virtual card that may or may not be linked with a financial or payment account that may be presented to a merchant or any such facility to fund a financial transaction via the associated payment account. Examples of the payment card include, but are not limited to, debit cards, credit cards, prepaid cards, virtual payment numbers, virtual card numbers, forex cards, charge cards, e-wallet cards, and stored-value cards. A payment card may be a physical card that may be presented to the merchant for funding the payment. Alternatively, or additionally, the payment card may be embodied in the form of data stored in a user device, where the data is associated with a payment account such that the data can be used to process the financial transaction between the payment account and a merchant’s financial account.
OVERVIEW
Various embodiments of the present disclosure provide methods, systems, electronic devices, and computer program products for generating merchant-related embeddings for learning merchant behavior to achieve enhanced model performance, accuracy, precision, and efficiency. In an embodiment, the present disclosure describes a server system for generating merchant-related embeddings, where the server system may be embodied within a payment server associated with a payment network. In one embodiment, the server system may access a cardholder-related dataset from a database associated with the server system. In an embodiment, the cardholder-related dataset includes historical information corresponding to payment transactions performed by a plurality of cardholders at a plurality of merchants using a plurality of payment cards.
The server system further generates transaction sequence data corresponding to each cardholder based, at least in part, on the cardholder-related dataset and a time stamp associated with each payment transaction. Herein, the transaction sequence data may include one or more card-specific sequences of payment transactions corresponding to each cardholder. Further, each card-specific sequence corresponds to a sequence of payment transactions linked with a time of occurrence and a predetermined label and performed by a particular cardholder at the plurality of merchants.
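By way of a simplified, non-limiting illustration, the transaction sequence data described above may be built by grouping transactions per payment card and ordering each group by its time stamp. The record layout and identifiers below are hypothetical and serve only to sketch the grouping step:

```python
from collections import defaultdict

# Hypothetical toy records: (card_id, merchant_id, timestamp, amount).
transactions = [
    ("card_1", "m_2", 30, 12.0),
    ("card_1", "m_1", 10, 45.5),
    ("card_2", "m_1", 20, 99.9),
    ("card_1", "m_3", 25, 7.25),
]

def build_card_sequences(records):
    """Group transactions by card and order each group by time stamp,
    yielding one card-specific sequence per cardholder."""
    sequences = defaultdict(list)
    for card_id, merchant_id, ts, amount in records:
        sequences[card_id].append((ts, merchant_id, amount))
    # Sorting the (ts, merchant, amount) tuples orders each sequence by time.
    return {card: sorted(txns) for card, txns in sequences.items()}

sequences = build_card_sequences(transactions)
# card_1's sequence is time-ordered: m_1 (t=10), m_3 (t=25), m_2 (t=30)
```

In practice, each ordered sequence would additionally carry the predetermined label and any derived sequence-related features before being stored in the database.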
Moreover, the server system generates a plurality of sequence-related features corresponding to each cardholder for each card-specific sequence based, at least in part, on the transaction sequence data and stores them in the database. Further, the server system accesses the plurality of sequence-related features from the database. In an embodiment, the plurality of sequence-related features may include at least: time stamps of payment transactions, a location flag, a payment card status flag, a transaction status flag, a sequence type, label information, and the like. The sequence-related features may correspond to the one or more card-specific sequences of payment transactions performed by the plurality of cardholders at the plurality of merchants. Herein, each card-specific sequence corresponds to a particular cardholder from the plurality of cardholders.
Upon accessing the plurality of sequence-related features, the server system may further generate, via a first ML model, a plurality of sequence-related embeddings based, at least in part, on the plurality of sequence-related features. Each sequence-related embedding corresponds to each card-specific sequence. In some embodiments, the server system may be configured to generate the first ML model by performing a first set of operations iteratively until a performance of the first ML model converges to first predefined criteria. In a specific embodiment, the first ML model may be a deviation-based Marked Temporal Point Process (MTPP) model.
In a specific embodiment, the first set of operations may include: (1) initializing the first ML model based on one or more first model parameters, (2) predicting, via the first ML model, an occurrence of a future payment transaction based, at least in part, on the plurality of sequence-related embeddings, (3) computing a deviation between the predicted occurrence and an actual occurrence of the future payment transaction, and (4) optimizing the one or more first model parameters. Then the process repeats with optimized first model parameters until the performance of the first ML model converges to the first predefined criteria.
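The first set of operations above can be sketched as a generic predict-deviate-optimize loop. The sketch below is a deliberately simplified stand-in, not an MTPP implementation: the "model" is a single scalar parameter predicting one inter-transaction time, and the optimizer and convergence tolerance are illustrative assumptions:

```python
def train_until_convergence(predict, optimize, params, actual, tol=1e-3, max_iters=100):
    """Iterate the first set of operations: predict the occurrence of a
    future payment transaction, measure the deviation from the actual
    occurrence, and update the model parameters until the deviation
    falls below a convergence tolerance (the first predefined criteria)."""
    for _ in range(max_iters):
        predicted = predict(params)           # step (2): predict occurrence
        deviation = abs(predicted - actual)   # step (3): compute deviation
        if deviation < tol:                   # first predefined criteria met
            break
        params = optimize(params, deviation)  # step (4): optimize parameters
    return params, deviation

# Toy stand-ins: the "model" predicts a single inter-transaction time of 5.0.
params, dev = train_until_convergence(
    predict=lambda p: p,
    optimize=lambda p, d: p + 0.5 * (5.0 - p),  # move halfway toward target
    params=0.0,
    actual=5.0,
)
```

A real deviation-based MTPP model would replace the scalar parameter with learned sequence-related embeddings and a gradient-based optimizer, but the control flow of the iterative training loop is the same.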
In a specific implementation, it may be noted that the server system may predict, via the first ML model, an occurrence of a future payment transaction at each merchant based, at least in part, on the plurality of sequence-related embeddings corresponding to each card-specific sequence. Further, the server system may identify an actual occurrence of the future payment transaction at each merchant based, at least in part, on historical information corresponding to payment transactions performed between the plurality of cardholders and the plurality of merchants. Herein, it may be noted that the historical information is accessed from the database. Furthermore, the server system may determine a deviation between the actual occurrence and a predicted occurrence of the future payment transaction at each merchant of the plurality of merchants based, at least in part, on comparing the actual occurrence and the prediction of the first ML model.
Moreover, in an embodiment, the server system may label the future payment transaction as a fraudulent transaction based, at least in part, on the deviation being at least equal to a predefined fraudulent threshold. In another embodiment, the server system may label the future payment transaction as a non-fraudulent transaction based, at least in part, on the deviation being less than the predefined fraudulent threshold.
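The labeling rule in the two embodiments above reduces to a single threshold comparison on the computed deviation. A minimal sketch, in which the threshold value is an illustrative assumption:

```python
def label_transaction(deviation, fraud_threshold):
    """Label a future payment transaction as fraudulent when the deviation
    between its predicted and actual occurrence is at least equal to the
    predefined fraudulent threshold, and non-fraudulent otherwise."""
    return "fraudulent" if deviation >= fraud_threshold else "non-fraudulent"

# Deviations below the threshold pass; those at or above it are flagged.
labels = [label_transaction(d, fraud_threshold=0.7) for d in (0.2, 0.9)]
```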
In another embodiment, the server system may generate a merchant-specific embedding for each merchant based, at least in part, on aggregating a relevant set of embeddings of the plurality of sequence-related embeddings. In a non-limiting implementation, the server system initially determines the relevant set of embeddings for each merchant based, at least in part, on the plurality of sequence-related embeddings. Herein, the relevant set of embeddings indicates one or more sequence-related embeddings of the plurality of sequence-related embeddings corresponding to the corresponding merchant. Later, the server system may aggregate the relevant set of embeddings for each merchant using a predefined aggregation technique and generate the merchant-specific embedding for each merchant based on the aggregation.
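The aggregation step above can be illustrated with an element-wise mean, which is one plausible choice of the predefined aggregation technique (the disclosure does not fix the technique, so the mean here is an assumption):

```python
def merchant_embedding(relevant_embeddings):
    """Aggregate the relevant set of sequence-related embeddings for one
    merchant into a single merchant-specific embedding via an
    element-wise mean."""
    dim = len(relevant_embeddings[0])
    count = len(relevant_embeddings)
    return [sum(vec[i] for vec in relevant_embeddings) / count for i in range(dim)]

# Embeddings of the card-specific sequences that touched this merchant.
relevant = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
emb = merchant_embedding(relevant)
```

Other aggregations (sum, max-pooling, attention-weighted averaging) would fit the same interface.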
In an embodiment, the server system further accesses a merchant-related dataset from the database. Herein, the merchant-related dataset may include historical information corresponding to payment transactions performed at each merchant. Furthermore, the server system may generate a plurality of merchant velocity features corresponding to each merchant based, at least in part, on the merchant-related dataset and store them in the database.
In an embodiment, the server system may access the plurality of merchant velocity features from the database and generate a plurality of merchant velocity embeddings for each merchant. In a specific embodiment, the server system may generate the plurality of merchant velocity embeddings using a second ML model. Herein, each merchant velocity embedding corresponds to each payment transaction performed at the corresponding merchant.
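Velocity features of the kind referenced above are typically trailing-window aggregates of a merchant's transaction activity. The windows and feature names below are illustrative assumptions, not features enumerated by the disclosure:

```python
def velocity_features(timestamps, now, windows=(1, 7, 30)):
    """Compute simple merchant velocity features: the number of payment
    transactions at a merchant within each trailing time window
    (window sizes in days are illustrative)."""
    return {
        f"txn_count_{w}d": sum(1 for t in timestamps if now - t <= w)
        for w in windows
    }

# Days (relative) at which transactions occurred at one merchant.
feats = velocity_features([1, 5, 9, 9.5, 10], now=10)
```

Each per-transaction feature vector of this kind would then be mapped by the second ML model to a merchant velocity embedding.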
In some embodiments, the server system may be configured to generate the second ML model by performing a second set of operations iteratively until the second ML model converges to second predefined criteria. In an embodiment, the second ML model may be a Gradient Boosting Machine (GBM) model.
In a specific embodiment, the second set of operations may include: (1) initializing the second ML model based on one or more second model parameters, (2) generating a decision tree, (3) computing an optimized loss function and other optimization parameters, (4) generating an updated decision tree, (5) assigning weights to each of the decision trees, (6) obtaining an ensemble of the decision trees, and (7) optimizing the one or more second model parameters. It may be noted that the process repeats with optimized second model parameters until the performance of the second ML model converges to the second predefined criteria. The server system may further be configured to generate a set of concatenated merchant-related embeddings for each merchant based, at least in part, on concatenating the merchant-specific embedding and the plurality of merchant velocity embeddings associated with each merchant.
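The concatenation step described above is a straightforward vector join: the single merchant-specific embedding is combined with each per-transaction velocity embedding. A minimal sketch, with toy dimensions chosen for illustration:

```python
def concatenated_embeddings(merchant_emb, velocity_embs):
    """Build the set of concatenated merchant-related embeddings for one
    merchant: the merchant-specific embedding is prepended to each
    per-transaction merchant velocity embedding."""
    return [merchant_emb + v for v in velocity_embs]

merchant_emb = [0.1, 0.2]                 # one embedding per merchant
velocity = [[1.0, 1.5], [0.9, 1.4]]       # one embedding per transaction
combined = concatenated_embeddings(merchant_emb, velocity)
```

The resulting vectors have the combined dimensionality of both inputs and form the feature space in which the second ML model learns the merchant behavior pattern.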
Upon obtaining the set of concatenated merchant-related embeddings, the server system may determine, via the second ML model, a merchant behavior pattern corresponding to each merchant based, at least in part, on the set of concatenated merchant-related embeddings and a predetermined collusion threshold. In a specific embodiment, the server system may identify the merchant behavior pattern corresponding to each merchant based, at least in part, on the set of concatenated merchant-related embeddings. Further, the server system may generate and assign a collusion score to at least one merchant of the plurality of merchants based, at least in part, on comparing the identified merchant behavior pattern with a predicted merchant behavior pattern. Furthermore, in one embodiment, the server system may label the at least one merchant as a collusive merchant based, at least in part, the collusion score being at least equal to the predetermined collusion threshold. In an alternative embodiment, the server system labels the at least one merchant of the plurality of merchants as a non-collusive merchant based, at least in part, on the collusion score being less than the predetermined collusion threshold.
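The collusion-labeling logic above can be sketched as a score-then-threshold rule. The pattern comparison used here (mean absolute difference between the identified and predicted pattern vectors) is a simplifying assumption; the disclosure does not fix a particular scoring function:

```python
def label_merchant(identified_pattern, predicted_pattern, collusion_threshold):
    """Score a merchant by how far its identified behavior pattern deviates
    from the predicted pattern (mean absolute difference, an illustrative
    choice), then label it collusive when the collusion score is at least
    equal to the predetermined collusion threshold."""
    score = sum(abs(a - b) for a, b in zip(identified_pattern, predicted_pattern))
    score /= len(identified_pattern)
    label = "collusive" if score >= collusion_threshold else "non-collusive"
    return score, label

# Toy pattern vectors for one merchant.
score, label = label_merchant([0.9, 0.8], [0.1, 0.2], collusion_threshold=0.5)
```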
Various embodiments of the present disclosure offer multiple advantages and technical effects. For instance, the present disclosure provides a novel process of generating concatenated merchant-related embeddings that can be used to learn merchant behavior. Consequently, several downstream merchant-related tasks can be performed at the acquirer’s end, e.g., a fraud detection task, with improved accuracy, precision, and performance of the model in comparison to conventional methods. It may be noted that the merchant-related embeddings generated using the first ML model, which are used for training the second ML model for fraud detection, are derived from sequence-related embeddings associated with cardholders who transacted at that particular merchant. Thus, a novel architecture of a combined model including the first ML model and the second ML model is proposed in the present disclosure.
Further, because the merchant-related embeddings also include the features related to the sequence of payment transactions performed by cardholders at the corresponding merchant, they cover a wider bandwidth of features in comparison to conventional models. This reduces the false positives and improves performance metrics such as the Area Under the Precision-Recall curve (AUC-PR) by a significant amount. Therefore, the overall performance of the combined ML model improves. Moreover, it may be noted that the architecture of the combined ML model is deployment-friendly and efficient at the same time.
Various example embodiments of the present disclosure are described hereinafter with reference to FIGS. 1 to 9.
FIG. 1 illustrates a schematic representation of an environment 100 related to at least some example embodiments of the present disclosure. Although the environment 100 is presented in one arrangement, other embodiments may include the parts of the environment 100 (or other parts) arranged otherwise depending on, for example, generating a merchant-specific embedding from a plurality of sequence-related embeddings, generating a set of concatenated merchant-related embeddings, and determining a merchant behavior pattern from the set of concatenated merchant-related embeddings for performing several downstream merchant-related tasks such as fraud detection.
The environment 100 generally includes a plurality of entities such as a server system 102, a plurality of cardholders 104(1), 104(2), … 104(N), where ‘N’ is a non-zero natural number (collectively referred to hereinafter as a ‘plurality of cardholders 104’ or simply ‘cardholders 104’), a plurality of merchants 106(1), 106(2), … 106(N) where ‘N’ is a non-zero natural number (collectively referred to hereinafter as a ‘plurality of merchants 106’ or simply ‘merchants 106’), a plurality of issuer servers 108(1), 108(2), … 108(N), where ‘N’ is a non-zero natural number (collectively referred to hereinafter as a ‘plurality of issuer servers 108’ or simply ‘issuer servers 108’), a plurality of acquirer servers 110(1), 110(2), … 110(N), where ‘N’ is a non-zero natural number (collectively referred to hereinafter as a ‘plurality of acquirer servers 110’ or simply ‘acquirer servers 110’), a payment network 112 including a payment server 114, and a database 116 each coupled to, and in communication with (and/or with access to) a network 118. The network 118 may include, without limitation, a Light Fidelity (Li-Fi) network, a Local Area Network (LAN), a Wide Area Network (WAN), a Metropolitan Area Network (MAN), a satellite network, the Internet, a fiber optic network, a coaxial cable network, an infrared (IR) network, a Radio Frequency (RF) network, a virtual network, and/or another suitable public and/or private network capable of supporting communication among two or more of the parts or users illustrated in FIG. 1, or any combination thereof.
Various entities in the environment 100 may connect to the network 118 in accordance with various wired and wireless communication protocols, such as Transmission Control Protocol and Internet Protocol (TCP/IP), User Datagram Protocol (UDP), 2nd Generation (2G), 3rd Generation (3G), 4th Generation (4G), 5th Generation (5G) communication protocols, Long Term Evolution (LTE) communication protocols, New Radio (NR) communication protocol, any future communication protocol, or any combination thereof. In some instances, the network 118 may utilize a secure protocol (e.g., Hypertext Transfer Protocol (HTTP), Secure Sockets Layer (SSL), and/or any other protocol, or set of protocols) for communicating with the various entities depicted in FIG. 1.
In an embodiment, the cardholder (e.g., the cardholder 104(1)) may be any individual, representative of a corporate entity, a non-profit organization, or any other person who is presenting payment account details during an electronic payment transaction. The cardholder (e.g., the cardholder 104(1)) may have a payment account issued by an issuing bank (not shown in figures) associated with an issuer server (e.g., the issuer server 108(1)) and may be provided with a payment card with financial or other account information encoded onto the payment card such that the cardholder (i.e., the cardholder 104(1)) may use the payment card to initiate and complete a payment transaction using a bank account at the issuing bank.
In another embodiment, the cardholders 104 may use their corresponding electronic devices (not shown in figures) to access a mobile application or a website associated with the issuing bank, or any third-party payment application to perform a payment transaction. In various non-limiting examples, the electronic devices may refer to any electronic devices such as, but not limited to, Personal Computers (PCs), tablet devices, smart wearable devices, Personal Digital Assistants (PDAs), voice-activated assistants, Virtual Reality (VR) devices, smartphones, laptops, and the like.
In an embodiment, the merchants 106 may include retail shops, restaurants, supermarkets or establishments, government and/or private agencies, or any such places equipped with POS terminals, where cardholders 104 visit to perform financial transactions in exchange for any goods and/or services or any financial transactions.
In one scenario, the cardholders 104 may use their corresponding payment accounts to conduct payment transactions with the merchants 106. Moreover, it may be noted that each of the cardholders 104 may use their corresponding payment cards differently or make the payment transaction using different means of payment such as net banking, Unified Payments Interface (UPI) payment, cheque, etc. For instance, the cardholder 104(1) may enter payment account details on an electronic device (not shown) associated with the cardholder 104(1) to perform an online payment transaction. In another example, the cardholder 104(2) may utilize a payment card to perform an offline payment transaction. It is understood that generally, the term “payment transaction” refers to an agreement that is carried out between a buyer and a seller to exchange goods or services in exchange for assets in the form of a payment (e.g., cash, fiat-currency, digital asset, cryptographic currency, coins, tokens, etc.). For example, the cardholder 104(3) may enter details of the payment card to transfer funds in the form of fiat currency on an e-commerce platform to buy goods. In another instance, each cardholder of the plurality of cardholders 104 (e.g., the cardholder 104(1)) may transact at any merchant from the plurality of merchants 106 (e.g., the merchant 106(1)).
In one embodiment, the cardholders 104 may be associated with financial institutions such as issuing banks who are associated with the issuer servers 108. To that end, it is noted that the terms “issuer bank”, “issuing bank” or simply “issuer”, and “issuer servers”, hereinafter may be used interchangeably. It may be understood that a cardholder (e.g., the cardholder 104(1)) may have the payment account with the issuing bank, (that may issue a payment card, such as a credit card or a debit card to the cardholders 104). Further, the issuing banks provide microfinance banking services (e.g., payment transactions using credit/debit cards) for processing electronic payment transactions, to the cardholder (e.g., the cardholder 104(1)).
In an embodiment, the merchants are generally associated with financial institutions such as acquiring banks who are associated with the acquirer servers 110. To that end, it is noted that the terms “acquirer”, “acquiring bank”, or “acquirer server”, and “acquirer servers” will be used interchangeably hereinafter. Herein, the acquiring bank can be an institution that facilitates the processing of payment transactions for physical stores, merchants, or institutions that own platforms that make either online purchases or purchases made via software applications possible.
As described earlier, the conventional approaches for detecting fraud, more specifically, merchant fraud detection, have several drawbacks. Typically, it should be noted that drawbacks of any conventional AI or ML model are associated with the fact that they are task-specific in nature, and when used for merchant fraud detection, they consider only a merchant-related dataset that is available at the acquirer’s end. Considering that the conventional AI or ML models are dependent on a limited dataset and only one type of dataset, the learning regarding the behavior of a merchant using the conventional AI or ML models is limited or poor, which causes the predictions of these models to be full of false positives. In other words, the accuracy, precision, and efficiency of these conventional AI or ML models are quite poor.
Therefore, a technical solution is required to facilitate the use of embeddings generated by one AI or ML model into another AI or ML model for performing a particular downstream task. More specifically, an approach is needed to train a model to generate merchant-related embeddings based on not only a merchant-related dataset, but also a cardholder-related dataset. Herein, it may be noted that embeddings corresponding to the merchant-related dataset may be available to the model, however, embeddings corresponding to the cardholder-related dataset are obtained from another model. To that end, a better learning of the behavior of a merchant may be obtained as it is dependent on more than one type of dataset i.e., the cardholder-related dataset as well as the merchant-related dataset. Further, the merchant behavior that is learned from the merchant-related embeddings, can be used for detecting merchant fraud at the acquirer’s end. As may be understood, features are essentially the attributes or characteristics extracted from the data that are used to make predictions or classifications. Similarly, embeddings are rich representations of the features that capture intricate patterns, relationships, and contextual information.
The above-mentioned technical problems, among other problems, are addressed by one or more embodiments implemented by the server system 102 and methods thereof provided in the present disclosure. In an embodiment, it may be noted that the methods and systems proposed in the present disclosure can be used for performing tasks other than fraud detection as well, without limiting the scope of the invention. However, for the sake of explanation, analysis, and model performance comparison, the example of detecting merchant fraud is considered in the present disclosure, more specifically collusive merchant detection is considered. To that end, it should be understood that the present disclosure is applicable for various other downstream applications such as merchant classification, merchant performance analysis, merchant sales analysis, and the like.
In one embodiment, the server system 102 is configured to facilitate payment processors to generate a set of concatenated merchant-related embeddings (otherwise, also referred to as merchant-related embeddings) by aggregating and concatenating transaction sequence-related embeddings (otherwise, also referred to as sequence-related embeddings) for each merchant with the merchant velocity embeddings for the corresponding merchant, understand behavior patterns of merchants, detect collusive merchants based on the set of concatenated merchant-related embeddings, and the like. In some embodiments, the server system 102 may be deployed as a standalone server or may be implemented in the cloud as software as a service (SaaS). In another embodiment, the server system 102 is configured to facilitate an instance of a software application running on an electronic device used by the payment processors to perform certain operations, so that the server system 102 can generate the set of concatenated merchant-related embeddings by aggregating and concatenating the sequence-related embeddings for each merchant and the merchant velocity embeddings for the corresponding merchant. In one non-limiting implementation, the server system 102 may facilitate the acquirers 110 with AI or ML models that can generate the merchant-related embeddings for learning the merchant behavior to achieve enhanced model performance, accuracy, precision, and efficiency.
As may be understood, in an embodiment, the database 116 associated with server system 102 is configured to store a cardholder-related dataset and a merchant-related dataset. In a non-limiting example, the cardholder-related dataset may include historical information corresponding to various or plurality of payment transactions performed by the plurality of cardholders 104 at the plurality of merchants 106 using a plurality of payment cards (not shown in FIG. 1) in the payment ecosystem. In another non-limiting example, the merchant-related dataset may include historical information corresponding to various or plurality of payment transactions performed at each merchant of the plurality of merchants 106 in the payment ecosystem.
Further, the server system 102, initially, accesses the cardholder-related dataset, generates a plurality of sequence-related features such as sequence-related features 120, and stores the sequence-related features 120 in the database 116. In another embodiment, the server system 102 accesses the merchant-related dataset from the database 116, generates a plurality of merchant velocity features such as merchant velocity features 122, and stores the merchant velocity features 122 in the database 116.
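The merchant velocity features 122 referred to above can be sketched as per-merchant transaction counts and amounts within rolling time windows over the merchant-related dataset. The window sizes and field names below are illustrative assumptions, not features prescribed by the disclosure.

```python
# Illustrative sketch of deriving merchant velocity features: per-merchant
# transaction counts and total amounts within rolling day-windows.
from collections import defaultdict

def velocity_features(transactions, now, windows=(1, 7, 30)):
    """transactions: list of dicts with 'merchant_id', 'amount', 'day' keys."""
    feats = defaultdict(dict)
    for w in windows:
        counts, totals = defaultdict(int), defaultdict(float)
        for txn in transactions:
            if now - txn["day"] < w:  # transaction falls inside the window
                counts[txn["merchant_id"]] += 1
                totals[txn["merchant_id"]] += txn["amount"]
        for mid in counts:
            feats[mid][f"txn_count_{w}d"] = counts[mid]
            feats[mid][f"txn_amount_{w}d"] = totals[mid]
    return dict(feats)

txns = [
    {"merchant_id": "M1", "amount": 50.0, "day": 29},
    {"merchant_id": "M1", "amount": 20.0, "day": 5},
    {"merchant_id": "M2", "amount": 10.0, "day": 28},
]
feats = velocity_features(txns, now=30)
```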
Furthermore, in an embodiment, the database 116 may also store one or more AI or ML models such as a first ML model 124, and a second ML model 126 that may be used by the server system 102 for further processing the sequence-related features 120 and the merchant velocity features 122.
In other various examples, the database 116 may also include multifarious data, for example, social media data, Know Your Customer (KYC) data, payment data, trade data, employee data, Anti Money Laundering (AML) data, market abuse data, Foreign Account Tax Compliance Act (FATCA) data, and fraudulent payment transaction data. In addition, the database 116 provides a storage location for data and/or metadata obtained from various operations performed by the server system 102.
Further, it may be noted that, in an example, the server system 102 coupled with the database 116 is embodied within the payment server 114; however, in other examples, the server system 102 can be a standalone component (acting as a hub) connected to the issuer servers 108 and the acquirer servers 110. The database 116 may be incorporated in the server system 102, may be an individual entity connected to the server system 102, or may be a database stored in cloud storage.
In various non-limiting examples, the database 116 may include one or more Hard Disk Drives (HDD), Solid-State Drives (SSD), an Advanced Technology Attachment (ATA) adapter, a Serial ATA (SATA) adapter, a Small Computer System Interface (SCSI) adapter, a redundant array of independent disks (RAID) controller, a Storage Area Network (SAN) adapter, a network adapter, and/or any component providing the server system 102 with access to the database 116. In one implementation, the database 116 may be viewed, accessed, amended, updated, and/or deleted by an administrator (not shown) associated with the server system 102 through a database management system (DBMS) or relational database management system (RDBMS) present within the database 116.
In another embodiment, the server system 102 is configured to access the sequence-related features 120 from the database 116. The sequence-related features 120 may correspond to one or more card-specific sequences of payment transactions performed by the plurality of cardholders 104 at the plurality of merchants 106. Herein, each card-specific sequence corresponds to a particular cardholder (e.g., the cardholder 104(1)) from the plurality of cardholders 104.
Upon accessing the sequence-related features 120, the server system 102 may further be configured to generate a plurality of sequence-related embeddings based, at least in part, on the sequence-related features 120. Each sequence-related embedding corresponds to each card-specific sequence. In a non-limiting example, the one or more AI or ML models may be used by the server system 102 for generating the plurality of sequence-related embeddings. Further, in a specific example, the first ML model 124 may be used for generating the plurality of sequence-related embeddings. In an embodiment, the first ML model 124 may be a deviation-based Marked Temporal Point Process (MTPP) model.
In some non-limiting embodiments, the server system 102 may be configured to generate the first ML model 124. It is noted that the process of generating the first ML model 124 is explained further in the present disclosure with reference to FIG. 6.
Further, the server system 102 may be configured to generate a merchant-specific embedding for each merchant of the plurality of merchants 106 based, at least in part, on aggregating a relevant set of embeddings of the plurality of sequence-related embeddings. Herein, in an embodiment, the relevant set of embeddings may indicate one or more sequence-related embeddings of the plurality of sequence-related embeddings corresponding to a particular merchant. Moreover, in a non-limiting example, the aggregation of the relevant set of embeddings may be performed using a predefined aggregation technique. In a specific example, the predefined aggregation technique may include taking a mean or an average of the corresponding relevant set of embeddings. Further, the merchant-specific embedding for each merchant may be generated from an aggregated embedding obtained from the aggregation of the relevant set of embeddings. It is noted that a detailed process of generating the merchant-specific embedding is explained further in the present disclosure with reference to FIG. 4A.
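The aggregation step above, using the mean as the predefined aggregation technique, can be sketched as follows. The grouping key (pairing each sequence-related embedding with its merchant) and the toy vectors are illustrative.

```python
# Minimal sketch of mean-pooling the relevant set of sequence-related
# embeddings into one merchant-specific embedding.
def merchant_embedding(sequence_embeddings, merchant_id):
    """sequence_embeddings: list of (merchant_id, embedding_vector) pairs."""
    # Relevant set: embeddings of cardholders who transacted at this merchant
    relevant = [vec for mid, vec in sequence_embeddings if mid == merchant_id]
    dim = len(relevant[0])
    # Element-wise mean over the relevant set of embeddings
    return [sum(vec[i] for vec in relevant) / len(relevant) for i in range(dim)]

embs = [
    ("M1", [0.25, 0.5]),
    ("M1", [0.75, 0.0]),
    ("M2", [1.0, 1.0]),
]
print(merchant_embedding(embs, "M1"))  # [0.5, 0.25]
```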
In another embodiment, the server system 102 is configured to access the merchant velocity features 122 from the database 116 and generate a plurality of merchant velocity embeddings based, at least in part, on the merchant velocity features 122 corresponding to each merchant. Each merchant velocity embedding corresponds to each payment transaction performed at the corresponding merchant. In a non-limiting example, the one or more AI or ML models may be used by the server system 102 for generating the plurality of merchant velocity embeddings. Further, in a specific example, the second ML model 126 may be used for generating the plurality of merchant velocity embeddings. In an embodiment, the second ML model 126 may be a Gradient Boosting Machine (GBM) model.
In some non-limiting embodiments, the server system 102 may be configured to generate the second ML model 126. It is noted that the process of generating the second ML model 126 is explained further in the present disclosure with reference to FIG. 7.
The server system 102 may further be configured to generate a set of concatenated merchant-related embeddings for each merchant based, at least in part, on concatenating the merchant-specific embedding with the plurality of merchant velocity embeddings associated with each merchant. As may be understood, for improving the capability of any AI or ML model to learn more precisely about merchant behavior patterns, along with merchant velocity features, sequence-related features corresponding to cardholders that transacted at the corresponding merchant may also be needed. Thus, in a non-limiting example, to enable this, the concatenation of respective embeddings with each other is performed using a concatenation technique. As used herein, the term “concatenation technique” refers to a process of appending one string to the end of another string. It is noted that a detailed process of generating the set of concatenated merchant-related embeddings is explained further in the present disclosure with reference to FIG. 4B.
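The concatenation step above can be sketched as appending each merchant velocity embedding, in order, after the merchant-specific embedding. The vector contents are illustrative placeholders.

```python
# Hedged sketch of forming a concatenated merchant-related embedding from
# the merchant-specific embedding and the merchant velocity embeddings.
def concatenate_embeddings(merchant_specific, velocity_embeddings):
    """Append each velocity embedding, in order, after the merchant-specific one."""
    combined = list(merchant_specific)
    for vel in velocity_embeddings:
        combined.extend(vel)
    return combined

merchant_specific = [0.4, 0.2]
velocity = [[0.9, 0.1], [0.3, 0.7]]
print(concatenate_embeddings(merchant_specific, velocity))
# [0.4, 0.2, 0.9, 0.1, 0.3, 0.7]
```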
Further, the server system 102 may be configured to determine a merchant behavior pattern corresponding to each merchant, based, at least in part, on the set of concatenated merchant-related embeddings. More specifically, the server system 102 may determine the merchant behavior pattern based, at least in part, on the set of concatenated merchant-related embeddings and a predetermined collusion threshold. Herein, the term ‘predetermined collusion threshold’ refers to a threshold by which the merchant behavior pattern of each merchant is permitted to deviate from a predicted merchant behavior pattern. In an embodiment, upon determining the merchant behavior pattern, it may be segregated into a collusive behavior and a non-collusive behavior.
In one embodiment, the payment network 112 may be used by the payment card issuing authorities as a payment interchange network. Examples of payment interchange networks include but are not limited to, Mastercard® payment system interchange network. The Mastercard® payment system interchange network is a proprietary communications standard promulgated by Mastercard International Incorporated® for the exchange of electronic payment transaction data between issuers and acquirers that are members of Mastercard International Incorporated®. (Mastercard is a registered trademark of Mastercard International Incorporated located in Purchase, N.Y.).
The number and arrangement of systems, devices, and/or networks shown in FIG. 1 are provided as an example. There may be additional systems, devices, and/or networks; fewer systems, devices, and/or networks; different systems, devices, and/or networks; and/or differently arranged systems, devices, and/or networks than those shown in FIG. 1. Furthermore, two or more systems or devices shown in FIG. 1 may be implemented within a single system or device, or a single system or device shown in FIG. 1 may be implemented as multiple, distributed systems or devices. In addition, the server system 102 should be understood to be embodied in at least one computing device in communication with the network 118, which may be specifically configured, via executable instructions, to perform steps as described herein, and/or embodied in at least one non-transitory computer-readable media.
FIG. 2 illustrates a simplified block diagram of a server system 200, in accordance with an embodiment of the present disclosure. The server system 200 is identical to the server system 102 of FIG. 1. In one embodiment, the server system 200 is a part of the payment network 112 or integrated within the payment server 114. In some embodiments, the server system 200 is embodied as a cloud-based and/or SaaS-based (software as a service) architecture.
The server system 200 includes a computer system 202 and a database 204. The computer system 202 includes at least one processor 206 for executing instructions, a memory 208, a communication interface 210, a user interface 212, and a storage interface 214. The one or more components of the computer system 202 communicate with each other via a bus 216. The components of the server system 200 provided herein may not be exhaustive and the server system 200 may include more or fewer components than those depicted in FIG. 2. Further, two or more components depicted in FIG. 2 may be embodied in one single component, and/or one component may be configured using multiple sub-components to achieve the desired functionalities.
In some embodiments, the database 204 is integrated into the computer system 202. In one non-limiting example, the database 204 is configured to store a cardholder-related dataset and a merchant-related dataset. In a non-limiting example, the cardholder-related dataset may include historical information corresponding to various or plurality of payment transactions performed by the plurality of cardholders 104 at the plurality of merchants 106 in the payment ecosystem. In another non-limiting example, the merchant-related dataset may include historical information corresponding to various or plurality of payment transactions performed at each merchant of the plurality of merchants 106 in the payment ecosystem.
In a non-limiting embodiment, the server system 200 is configured to access the cardholder-related dataset from the database 204, generate a plurality of sequence-related features such as sequence-related features 218, and store the sequence-related features 218 in the database 204. In another embodiment, the server system 200 is configured to access the merchant-related dataset from the database 204, generate a plurality of merchant velocity features such as merchant velocity features 220, and store the merchant velocity features 220 in the database 204. Furthermore, in an embodiment, the database 204 may also store one or more AI or ML models such as a first ML model 222, and a second ML model 224 that may be used by the server system 200 for further processing the sequence-related features 218 and the merchant velocity features 220. Herein, the sequence-related features 218, the merchant velocity features 220, the first ML model 222, and the second ML model 224 are identical to the sequence-related features 120, the merchant velocity features 122, the first ML model 124, and the second ML model 126 respectively of FIG. 1.
Further, the computer system 202 may include one or more hard disk drives as the database 204. The user interface 212 is an interface such as a Human Machine Interface (HMI) or a software application that allows users such as an administrator to interact with and control the server system 200 or one or more parameters associated with the server system 200. It may be noted that the user interface 212 may be composed of several components that vary based on the complexity and purpose of the application. Examples of components of the user interface 212 may include visual elements, controls, navigation, feedback and alerts, user input and interaction, responsive design, user assistance and help, accessibility features, and the like. More specifically these components may correspond to icons, layout, color schemes, buttons, sliders, dropdown menus, tabs, links, error/success messages, mouse and touch interactions, keyboard shortcuts, tooltips, screen readers, and the like.
The storage interface 214 is any component capable of providing the processor 206 access to the database 204. The storage interface 214 may include, for example, an Advanced Technology Attachment (ATA) adapter, a Serial ATA (SATA) adapter, a Small Computer System Interface (SCSI) adapter, a RAID controller, a SAN adapter, a network adapter, and/or any component providing the processor 206 with access to the database 204.
The processor 206 includes suitable logic, circuitry, and/or interfaces to execute operations for generating a merchant-specific embedding from a plurality of sequence-related embeddings, generating a set of concatenated merchant-related embeddings, determining a merchant behavior pattern from the set of concatenated merchant-related embeddings, performing fraud detection based on the merchant behavior pattern, and the like. Examples of the processor 206 include, but are not limited to, an Application-Specific Integrated Circuit (ASIC) processor, a Reduced Instruction Set Computing (RISC) processor, a Graphical Processing Unit (GPU), a complex instruction set computing (CISC) processor, a Field-Programmable Gate Array (FPGA), and the like.
The memory 208 includes suitable logic, circuitry, and/or interfaces to store a set of computer-readable instructions for performing operations. Examples of the memory 208 include a Random-Access Memory (RAM), a Read-Only Memory (ROM), a removable storage drive, a Hard Disk Drive (HDD), and the like. It will be apparent to a person skilled in the art that the scope of the disclosure is not limited to realizing the memory 208 in the server system 200, as described herein. In another embodiment, the memory 208 may be realized in the form of a database server or a cloud storage working in conjunction with the server system 200, without departing from the scope of the present disclosure.
The processor 206 is operatively coupled to the communication interface 210, such that the processor 206 is capable of communicating with a remote device 226 such as the issuer servers 108, the acquirer servers 110, the payment server 114, or communicating with any entity connected to the network 118 (as shown in FIG. 1).
It is noted that the server system 200 as illustrated and hereinafter described is merely illustrative of an apparatus that could benefit from embodiments of the present disclosure and, therefore, should not be taken to limit the scope of the present disclosure. It is noted that the server system 200 may include fewer or more components than those depicted in FIG. 2. It should be noted that the server system 200 is identical to the server system 102 described in reference to FIG. 1.
In one implementation, the processor 206 includes a data pre-processing module 228, a training module 230, a generation module 232, an aggregation module 234, a concatenation module 236, and a labeling module 238. It should be noted that components, described herein, such as the data pre-processing module 228, the training module 230, the generation module 232, the aggregation module 234, the concatenation module 236, and the labeling module 238 can be configured in a variety of ways, including electronic circuitries, digital arithmetic, and logic blocks, and memory systems in combination with software, firmware, and embedded technologies. Moreover, it may be noted that the data pre-processing module 228, the training module 230, the generation module 232, the aggregation module 234, the concatenation module 236, and the labeling module 238 may be communicably coupled with each other to exchange information with each other for performing the one or more operations facilitated by the server system 200.
In an embodiment, the data pre-processing module 228 includes suitable logic and/or interfaces for accessing the cardholder-related dataset from the database 204. The cardholder-related dataset may include historical information corresponding to various or plurality of payment transactions performed by the plurality of cardholders 104 at the plurality of merchants 106 using a plurality of payment cards in the payment ecosystem.
In a specific embodiment, the historical information corresponding to the various or a plurality of payment transactions may include information related to payment transactions that may have been performed within a predetermined interval of time (e.g., 6 months, 12 months, 24 months, etc.). In another embodiment, the historical information may include information on both fraudulent and non-fraudulent payment transactions performed within the payment network 112.
In some non-limiting examples, the historical information includes information related to at least merchant name identifier, unique merchant identifier, timestamp information, geo-location related data, Merchant Category Code (MCC), information related to payment instruments involved in the set of historical payment transactions, cardholder identifier, Permanent Account Number (PAN), merchant DBA name, country code, transaction identifier and the like.
In another embodiment, the historical information may include information related to past payment transactions such as transaction date, transaction time, geo-location of a transaction, transaction amount, transaction label (e.g., fraudulent or non-fraudulent), and the like.
The data pre-processing module 228 may further be configured to generate transaction sequence data corresponding to each cardholder, including the one or more card-specific sequences, from the cardholder-related dataset. Each card-specific sequence may correspond to a sequence of payment transactions linked with a time of occurrence and a predetermined label and performed by a particular cardholder at the plurality of merchants. For example, a card-specific sequence of the cardholder 104(2) may include the last five transactions of the cardholder 104(2) that are performed at a particular merchant such as a merchant 106(2). To that end, it may be understood that each cardholder may have multiple card-specific sequences, each card-specific sequence being different for different merchants. In a specific embodiment, the transaction sequence data may further include a merchant type, a sequence pattern description, a transaction type, a cardholder identity (ID), a merchant ID, a count of transactions in the sequence, a time stamp associated with each payment transaction in the sequence, a label associated with each payment transaction in the sequence, and the like.
The data pre-processing module 228 further generates the sequence-related features 218 corresponding to each cardholder for each card-specific sequence based, at least in part, on the transaction sequence data. In a non-limiting example, the server system 200 may store the sequence-related features 218 in the database 204. It should be noted that the sequence-related features 218 may be accessed from the database 204 whenever required.
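By way of a non-limiting illustrative sketch (not the claimed implementation), the generation of card-specific sequences described above may be approximated by grouping raw transactions per (cardholder, merchant) pair, ordering them by timestamp, and keeping the most recent transactions. The field names `card_id`, `merchant_id`, `ts`, and `label` are assumed for illustration only.

```python
from collections import defaultdict

# Illustrative sketch: build card-specific sequences by grouping transactions
# per (cardholder, merchant) pair, ordering by time of occurrence, and keeping
# the last max_len transactions. Field names are assumptions for illustration.
def build_card_specific_sequences(transactions, max_len=5):
    groups = defaultdict(list)
    for txn in transactions:
        groups[(txn["card_id"], txn["merchant_id"])].append(txn)
    sequences = {}
    for key, txns in groups.items():
        txns.sort(key=lambda t: t["ts"])   # order by time of occurrence
        sequences[key] = txns[-max_len:]   # keep the most recent max_len
    return sequences
```

Each resulting sequence retains the per-transaction timestamps and labels, from which the sequence-related features 218 can subsequently be derived.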
In an example embodiment, the sequence-related features 218 may include at least: a timestamp of payment transactions, a location flag, a payment card status flag, a transaction status flag, a sequence type, label information, a Decision Intelligence (DI) score, and the like. Herein, the term ‘timestamp of payment transactions’ indicates the time at which the payment transaction was performed by the cardholder (e.g., the cardholder 104(1)).
Further, the term ‘location flag’ indicates whether the payment transaction is being performed at a merchant in the same country as the cardholder (i.e., a domestic transaction). In an embodiment, it may be noted that cross-border transactions, where the merchant and the cardholder belong to different countries, usually have a higher risk of being fraudulent.
Furthermore, the term ‘payment card status flag’ may indicate whether the cardholder is present at the point of service at transaction time. In an embodiment, the payment card status flag may include one of a Card Not Present (CNP) flag and a Card Present flag tagged to a payment transaction. Examples of payment transactions that are flagged with the CNP flag include online payments, phone/mail orders, recurring transactions, etc. CNP transactions are typically performed online and generally carry a higher risk of fraud.
Moreover, the term ‘transaction status flag’ may denote whether the transaction request was allowed by the issuer or not. In an instance, this means that the transaction status flag indicates whether the payment transaction was completed, and the merchant received the amount from the cardholder or not. In an embodiment, the transaction status flag may include one of a transaction accepted flag and a transaction declined flag. In some examples, common reasons for declining the payment transactions include anticipation of fraud, a breach in limits set on transaction amount/count, and the like.
Furthermore, the term ‘sequence type’ may indicate whether the sequence has a regular time interval or an irregular time interval. Similarly, the term ‘label information’ refers to information related to labels that may have been assigned for past transactions. In an embodiment, the label, in case of fraud detection, may include one of a fraudulent transaction flag and a non-fraudulent transaction flag.
Further, the term ‘DI score’ indicates a metric that assesses a risk associated with a payment transaction in real time and provides the score to the issuer at the time of authorization. In a non-limiting example, it is understood that the DI score is a Mastercard® product. In a specific example, it may be noted that the DI scores lie in a range of 0-999 with higher values signifying an increased probability of the payment transaction being fraudulent.
In another non-limiting example, some other players provide a similar score, namely an ‘advanced identity score’. Herein, the term ‘advanced identity score’ refers to a score that combines the AI and predictive capabilities of such other players with application and identity-related data to generate a risk score for new account applications to help reduce fraud, prevent negative impact on brand loyalty and trust, and eliminate operational costs due to remediation.
In a particular implementation of the present disclosure, the sequence-related features 218 may further include a binned amount which indicates the amount of a transaction. In an embodiment, it may be noted that while the amount is inherently a continuous feature, it can be binned into 5 buckets such as [0, 10), [10, 100), [100, 1000), [1000, 10000), [10000, ∞). Herein, it is understood that ‘∞’ indicates that the final bucket covers any amount of 10000 or more. Herein, binning may not be limited to five buckets as the transaction amount can be binned in any number of buckets without limiting the scope of the invention.
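The binning of the continuous transaction amount into the five half-open buckets above may be sketched, in a non-limiting example, as follows:

```python
import bisect

# Illustrative sketch: bin a continuous transaction amount into the five
# buckets described above: [0,10), [10,100), [100,1000), [1000,10000), [10000, inf).
BUCKET_EDGES = [10, 100, 1000, 10000]

def binned_amount(amount):
    # bisect_right returns the index of the first edge strictly greater than
    # amount, which is exactly the bucket index for half-open intervals.
    return bisect.bisect_right(BUCKET_EDGES, amount)
```

For instance, an amount of 10 falls in bucket 1 ([10, 100)), while any amount of 10000 or more falls in the final bucket.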
Further, the generation module 232 includes suitable logic and/or interfaces for accessing a plurality of sequence-related features (e.g., the sequence-related features 218) from a database (e.g., the database 204). The generation module 232 is further configured to generate a plurality of sequence-related embeddings based, at least in part, on the sequence-related features 218. Each sequence-related embedding corresponds to each card-specific sequence. In a non-limiting example, the one or more AI or ML models may be used by the generation module 232 for generating the plurality of sequence-related embeddings. Further, in a specific example, the first ML model 222 may be used for generating the plurality of sequence-related embeddings. In an embodiment, the first ML model 222 may be a deviation-based Marked Temporal Point Process (MTPP) model. It may be noted that the plurality of sequence-related embeddings is provided to the aggregation module 234.
Further, in some non-limiting embodiments, the training module 230 includes suitable logic and/or interfaces for generating the first ML model 222. It may be noted that the training module 230 may generate the first ML model 222 by performing a first set of operations iteratively until the performance of the first ML model 222 converges to first predefined criteria.
In a specific embodiment, the first set of operations may include: (1) initializing the first ML model 222 based on one or more first model parameters, (2) predicting, via the first ML model 222, an occurrence of a future payment transaction based on the plurality of sequence-related embeddings, (3) computing a deviation between the predicted occurrence and an actual occurrence of the future payment transaction, and (4) optimizing the one or more first model parameters. Then the process repeats with optimized first model parameters until the performance of the first ML model 222 converges to the first predefined criteria. It is noted that the process of generating the first ML model 222 is explained in detail in the present disclosure with reference to FIG. 6.
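The iterative first set of operations above may be sketched with a deliberately simplified stand-in for the deviation-based MTPP model: a toy one-parameter model predicts the next inter-event time as a scaled mean of past inter-event times, the deviation from the actual occurrence is computed, and the parameter is optimized by gradient descent until the loss converges. This is an illustrative assumption only, not the disclosed model architecture.

```python
# Simplified stand-in for the first set of operations:
# (1) initialize parameter, (2) predict occurrence, (3) compute deviation,
# (4) optimize parameter; repeat until a predefined convergence criterion.
def train_toy_predictor(inter_event_times, lr=0.01, tol=1e-6, max_iters=10000):
    theta = 1.0                                     # (1) initialize model parameter
    history = inter_event_times[:-1]
    history_mean = sum(history) / len(history)
    actual = inter_event_times[-1]
    prev_loss = float("inf")
    predicted = theta * history_mean
    for _ in range(max_iters):
        predicted = theta * history_mean            # (2) predict occurrence
        deviation = predicted - actual              # (3) compute deviation
        loss = deviation ** 2
        if abs(prev_loss - loss) < tol:             # predefined criterion met
            break
        theta -= lr * 2 * deviation * history_mean  # (4) optimize parameter
        prev_loss = loss
    return theta, predicted
```

In this toy setup, the parameter converges to the ratio of the actual inter-event time to the historical mean.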
Further, the aggregation module 234 includes suitable logic and/or interfaces for determining a relevant set of embeddings for each merchant based, at least in part, on the plurality of sequence-related embeddings. In a non-limiting example, the relevant set of embeddings indicates one or more sequence-related embeddings of the plurality of sequence-related embeddings corresponding to a particular merchant.
The aggregation module 234 may further be configured to aggregate the relevant set of embeddings for each merchant using a predefined aggregation technique. Furthermore, the aggregation module 234 may be configured to generate a merchant-specific embedding for each merchant of the plurality of merchants 106 based, at least in part, on aggregating the relevant set of embeddings for each merchant.
In an example scenario, the cardholder 104(3) may have transacted multiple times at each of three merchants such as the merchant 106(1), the merchant 106(2), and a merchant 106(3). To that end, the cardholder 104(3) can have three card-specific sequences which may be the same as or different from each other. Herein, each card-specific sequence has a sequence-related embedding. Further, each merchant of the merchants 106(1)-106(3) can have multiple sequence-related embeddings related to the different cardholders 104 that transacted at the corresponding merchants 106(1)-106(3). In such a scenario, the aggregation module 234 may determine, for each merchant, the sequence-related embeddings that form the relevant set of embeddings, and then generate a merchant-specific embedding for each merchant based, at least in part, on aggregating the relevant set of embeddings. In an embodiment, the merchant-specific embeddings for the plurality of merchants 106 are provided to the concatenation module 236.
In a non-limiting example, the predefined aggregation technique refers to performing a mean or an average of the corresponding relevant set of embeddings. In the above-mentioned example, the aggregation of the relevant set of embeddings may correspond to the mean or average of a set of payment transactions corresponding to each cardholder, followed by the mean or average of the relevant set of embeddings for each merchant.
In another non-limiting example, the predefined aggregation technique refers to performing an average pooling process across multiple payment transactions for each payment card associated with each cardholder. This is followed by performing an average pooling across the multiple payment cards associated with the multiple cardholders that transacted at each merchant.
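The two-stage average pooling described above may be sketched, in a non-limiting example, as follows, where the per-transaction embeddings are assumed to be plain numeric vectors:

```python
# Illustrative sketch of the two-stage average pooling: first pool embeddings
# across a card's transactions at a merchant, then pool the per-card results
# across all cards that transacted at that merchant.
def mean_vector(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def merchant_specific_embedding(embeddings_by_card):
    # embeddings_by_card: {card_id: [per-transaction embedding, ...]}
    per_card = [mean_vector(vecs) for vecs in embeddings_by_card.values()]
    return mean_vector(per_card)
```

Averaging per card first prevents a single high-frequency cardholder from dominating the merchant-specific embedding.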
In another embodiment, the data pre-processing module 228 is further configured to access the merchant-related dataset from the database 204. In an embodiment, the merchant-related dataset may include historical information corresponding to a plurality of payment transactions performed at each merchant of the plurality of merchants 106 in the payment ecosystem.
In a specific embodiment, the historical information may include information related to the acquirer servers 110 such as the date of merchant registration with the acquirer servers 110, amount of payment transactions performed at the acquirer servers 110 in a day, number of payment transactions performed at the acquirer servers 110 in a day, maximum transaction amount, minimum transaction amount, number of fraudulent merchants or non-fraudulent merchants registered with the acquirer servers 110, and the like.
The data pre-processing module 228 further generates a plurality of merchant velocity features (e.g., the merchant velocity features 220) based, at least in part, on the merchant-related dataset. Further, the data pre-processing module 228 may store the merchant velocity features 220 for each merchant in the database 204. It should be noted that the merchant velocity features 220 may be accessed from the database 204 whenever required.
Further, in an embodiment, the generation module 232 accesses the merchant velocity features 220 from the database 204 and generates a plurality of merchant velocity embeddings for each merchant based, at least in part, on the merchant velocity features 220. Herein, each merchant velocity embedding corresponds to each payment transaction performed at the corresponding merchant. In an embodiment, the plurality of merchant velocity embeddings is also provided to the concatenation module 236.
In a non-limiting example, the one or more AI or ML models may be used by the generation module 232 for generating the plurality of merchant velocity embeddings. Further, in a specific example, the second ML model 224 may be used for generating the plurality of merchant velocity embeddings. In an embodiment, the second ML model 224 may include a Gradient Boosting Machine (GBM) model.
Further, in some non-limiting embodiments, the training module 230 may be configured to generate the second ML model 224. It may be noted that the training module 230 may generate the second ML model 224 by performing a second set of operations iteratively until the performance of the second ML model 224 converges to second predefined criteria.
In a specific embodiment, the second set of operations may include: (1) initializing the second ML model 224 based on one or more second model parameters, (2) generating a decision tree, (3) computing optimized loss function and other optimization parameters, (4) generating an updated decision tree, (5) assigning weights to each of the decision trees, (6) obtaining an ensemble of the decision trees, and (7) optimizing the one or more second model parameters. Then the process repeats with optimized second model parameters until the performance of the second ML model 224 converges to the second predefined criteria. It may be noted that the process of generating the second ML model 224 is explained in detail in the present disclosure with reference to FIG. 7.
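The gradient-boosting steps above may be sketched with a minimal boosting routine that fits depth-1 regression stumps to residuals under squared loss. This is a simplified illustration of the boosting principle only, not the disclosed second ML model 224.

```python
# Minimal gradient boosting with depth-1 regression stumps (squared loss):
# each round fits a stump to the current residuals, assigns it a learning-rate
# weight, and adds it to the ensemble, mirroring steps (2)-(6) above.
def fit_stump(x, residuals):
    best = None
    for t in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - (lm if xi <= t else rm)) ** 2
                  for xi, r in zip(x, residuals))
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]

def boost(x, y, rounds=20, lr=0.3):
    pred = [sum(y) / len(y)] * len(y)      # initialize with the mean
    ensemble = []
    for _ in range(rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]   # pseudo-residuals
        t, lm, rm = fit_stump(x, resid)
        ensemble.append((t, lm, rm))
        pred = [pi + lr * (lm if xi <= t else rm) for xi, pi in zip(x, pred)]
    return ensemble, pred
```

Each stump corrects the residual error left by the current ensemble, so the training predictions converge geometrically toward the targets.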
Moreover, the concatenation module 236 includes suitable logic and/or interfaces for generating a set of concatenated merchant-related embeddings for each merchant of the plurality of merchants 106 based, at least in part, on concatenating the merchant-specific embedding with the plurality of merchant velocity embeddings associated with each merchant.
In an embodiment, for generating the set of concatenated merchant-related embeddings, the concatenation module 236 initially concatenates the plurality of merchant velocity embeddings with each other. In a non-limiting example, the concatenation module 236 concatenates the plurality of merchant velocity embeddings with each other using the concatenation technique. Further, the concatenation module 236 generates a concatenated merchant velocity embedding for each merchant based, at least in part, on concatenating the plurality of merchant velocity embeddings with each other.
Furthermore, the concatenation module 236 concatenates the merchant-specific embedding with the concatenated merchant velocity embedding for each merchant 106. Later, the concatenation module 236 generates the set of concatenated merchant-related embeddings based, at least in part, on concatenating the merchant-specific embedding and the plurality of merchant velocity embeddings associated with each merchant. In an embodiment, the set of concatenated merchant-related embeddings may be provided to the labeling module 238 and the generation module 232.
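In a non-limiting sketch, the two-step concatenation performed by the concatenation module 236 may be illustrated as follows, with embeddings assumed to be plain numeric vectors:

```python
# Illustrative sketch: first concatenate a merchant's velocity embeddings with
# each other, then concatenate the merchant-specific embedding with the result.
def concatenated_merchant_embedding(merchant_specific, velocity_embeddings):
    concatenated_velocity = [x for emb in velocity_embeddings for x in emb]
    return merchant_specific + concatenated_velocity
```

The resulting vector carries both the sequence-derived merchant representation and the velocity-derived representations in a single embedding.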
The generation module 232 includes suitable logic and/or interfaces for determining a merchant behavior pattern corresponding to each merchant based, at least in part, on the set of concatenated merchant-related embeddings. More specifically, the generation module 232 may determine the merchant behavior pattern based, at least in part, on the set of concatenated merchant-related embeddings and a predetermined collusion threshold. Herein, the term ‘predetermined collusion threshold’ refers to a threshold by which the merchant behavior pattern of each merchant is permitted to deviate from a predicted merchant behavior pattern. In a non-limiting example, the generation module 232 may determine the merchant behavior pattern using the one or more AI or ML models. More specifically, the second ML model 224 may be used for determining the merchant behavior pattern.
In an embodiment, for determining the merchant behavior pattern, the generation module 232 identifies the merchant behavior pattern corresponding to each merchant 106 based, at least in part, on the set of concatenated merchant-related embeddings. Further, the generation module 232 accesses a predicted merchant behavior pattern, and the identified merchant behavior pattern is compared with the predicted merchant behavior pattern.
The generation module 232 further generates and assigns a collusion score to at least one merchant of the plurality of merchants 106 based, at least in part, on comparing the identified merchant behavior pattern with the predicted merchant behavior pattern.
Further, the labeling module 238 includes suitable logic and/or interfaces for labeling the at least one merchant of the plurality of merchants 106 based, at least in part, on the collusion score. In a non-limiting example, the labeling module 238 labels the at least one merchant as a collusive merchant based, at least in part, on the collusion score being at least equal to a predetermined collusion threshold. In another non-limiting example, the labeling module 238 labels the at least one merchant of the plurality of merchants as a non-collusive merchant based, at least in part, on the collusion score being less than the predetermined collusion threshold.
It may be noted that the predetermined collusion threshold may be determined while training the second ML model 224 via the training module 230. Later, in an embodiment, the labeling module 238 compares the collusion score of each merchant with the predetermined collusion threshold. Upon comparison, if the collusion score is greater than or equal to the predetermined collusion threshold, the labeling module 238 labels the corresponding merchant as the collusive merchant. Otherwise, the labeling module 238 labels the corresponding merchant as the non-collusive merchant.
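The thresholding rule above may be sketched, in a non-limiting example, as a simple comparison:

```python
# Illustrative sketch of the labeling rule: a merchant whose collusion score
# meets or exceeds the predetermined threshold is labeled collusive;
# otherwise it is labeled non-collusive.
def label_merchant(collusion_score, collusion_threshold):
    if collusion_score >= collusion_threshold:
        return "collusive"
    return "non-collusive"
```

Note that a score exactly equal to the threshold is labeled collusive, consistent with the "greater than or equal to" rule described above.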
In an example embodiment, the generation module 232 is also configured to predict an occurrence of a future payment transaction, identify an actual occurrence of the future payment transaction, and determine a deviation between the two. Further, in an embodiment, the labeling module 238 labels the future payment transaction as a fraudulent payment transaction or a non-fraudulent payment transaction based on the deviation and a predefined fraudulent threshold. It may be noted that the labeling module 238 is facilitated by the first ML model 222 to identify fraudulent payment transactions. Further, the process of labeling fraudulent transactions is explained in detail with reference to FIG. 5.
In another example embodiment, a performance of the second ML model 224 may be determined based on performance metrics associated with the second ML model 224. For example, in the case where the second ML model 224 is a Gradient Boosting Machine (GBM) model, the term ‘performance metrics’ refers to a set of parameters that are measured to evaluate the performance of a model, wherein the model may be used for a classification task or for a regression task. Examples of the performance metrics for a classification task include accuracy, precision, recall (sensitivity or True Positive Rate), F1-score, Area Under the Receiver Operating Characteristic (ROC) curve (AUC-ROC), Area Under the Precision-Recall Curve (AUC-PR), confusion matrix, and the like. Similarly, examples of the performance metrics for a regression task include Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and the like.
In an example embodiment of the present disclosure, the second ML model 224 may be the GBM model for the classification task of classifying the merchants as a collusive merchant or a non-collusive merchant. Thus, the performance metrics for the classification task are considered. To that end, it may be understood that a first set of performance metrics includes the above-mentioned performance metrics generated for the second ML model 224 based on the plurality of merchant velocity embeddings. Similarly, a second set of performance metrics includes the above-mentioned performance metrics generated for the second ML model 224 based on the set of concatenated merchant-related embeddings. The first set of performance metrics and the second set of performance metrics may be compared, and the performance of the second ML model 224 is determined based on the comparison. In an embodiment, if the values corresponding to the second set of performance metrics are better than those of the first set of performance metrics, then the performance of the second ML model 224 is determined to be improved.
It may be noted that to prove that the performance of the second ML model 224 is improved upon the introduction of the sequence-related embeddings for merchant fraud detection, an experiment is conducted using the following data:
                Train   Val    Test
Transactions    11M     629k   1.1M
Merchants       2M      88k    140k
Payment cards   568k    29k    50k
Fraud %         30.7    40.4   44.1
Table 1: Transaction data of payment cards with high fraud rate in the Europe region
As shown in illustrative Table 1, data in the column indicated by the term ‘train’ refers to data used for training the first ML model 222 and the second ML model 224. Similarly, data in the column indicated by the term ‘val’ refers to data used for validation of the first ML model 222 and the second ML model 224. Further, data in the column indicated by the term ‘test’ refers to data used for testing the first ML model 222 and the second ML model 224. Thus, it may be understood that the cardholder-related dataset (i.e., transaction data of the payment cards shown in Table 1) is split into a training dataset (referred to as ‘train’ in Table 1), a validation dataset (referred to as ‘val’ in Table 1), and a testing dataset (referred to as ‘test’ in Table 1).
Moreover, the cardholder-related dataset includes a count of transactions performed at a group of merchants via multiple payment cards, fraud percentage, and the like. The server system 200 generates the sequence-related features 218 from the cardholder-related dataset and stores them in the database 204 via the data pre-processing module 228. Later, the server system 200 accesses the sequence-related features 218 and generates a plurality of sequence-related embeddings for each card-specific sequence via the generation module 232.
Further, a relevant set of embeddings is aggregated from the plurality of sequence-related embeddings via the aggregation module 234 and concatenated with each other via the concatenation module 236. Such embeddings may be referred to as TPP embeddings as the first ML model 222 that is used for generating such embeddings is a deviation-based MTPP model.
Parallelly, the server system 200 accesses the merchant velocity features 220 from the database 204 via the generation module 232, and the merchant velocity embeddings are generated for the same. These embeddings along with the TPP embeddings are concatenated with each other for generating the set of concatenated merchant-related embeddings via the concatenation module 236. These embeddings may be referred to as testing embeddings.
Further, for validation purposes, a real-world dataset may be considered which is as follows:
              Test
Transactions  14.8M
Merchants     60k
Cards         399k
Fraud %       0.00013
Table 2: Real-world dataset for evaluation of model performance
As shown in illustrative Table 2, the real-world dataset may be used for validation of the first ML model 222 and the second ML model 224. Herein, the first ML model 222 and the second ML model 224 are first tested using the testing embeddings and then validated with the real-world dataset. Upon doing so, in a non-limiting implementation, the following results are obtained:
                   Predicted Genuine   Predicted Collusive
Actual Genuine     TN = 56893          FP = 1456
Actual Collusive   FN = 1034           TP = 1359
Table 3A: Results without TPP embeddings

                   Predicted Genuine   Predicted Collusive
Actual Genuine     TN = 57110          FP = 1239
Actual Collusive   FN = 1029           TP = 1364
Table 3B: Results with TPP embeddings

AUC 0.95
AUC-PR 0.50
Precision 0.48
Recall 0.56
Table 4A: Performance metrics without TPP embeddings
AUC 0.959
AUC-PR 0.558
Precision 0.52
Recall 0.569
Table 4B: Performance metrics with TPP embeddings
Referring to illustrative Tables 3A and 3B, a relative reduction in the count of False Positives (FP) can be seen upon the introduction of the TPP embeddings which is a desirable factor and adds to relatively improving the performance of the second ML model 224.
Further, as may be gathered from illustrative Tables 4A and 4B, a relative improvement in one or more of the performance metrics can also be seen. For example, the AUC-PR has increased from about 0.50 to about 0.558 (percentage equivalents being about 50 percent (%) to about 55.8 %, i.e., an increase of approximately 6 %), which is another desirable factor adding to the improvement in the overall performance of the second ML model 224 due to the addition of the TPP embeddings. Similarly, other performance metrics such as the AUC, precision, and recall have also increased by approximately 0.009 (nearly 1 %), 0.04 (nearly 4 %), and 0.009 (nearly 1 %), respectively. It may be noted that, practically, the values of the performance metrics range from 0 to 1, and as the value of each performance metric moves away from 0 and closer to 1, the performance of the second ML model 224 improves.
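The precision and recall figures above can be recomputed directly from the confusion-matrix counts. Using the with-TPP counts from Table 3B:

```python
# Recomputing precision and recall from the Table 3B counts (with TPP
# embeddings): TP = 1364, FP = 1239, FN = 1029.
tp, fp, fn = 1364, 1239, 1029
precision = tp / (tp + fp)   # about 0.52, matching Table 4B
recall = tp / (tp + fn)      # about 0.57 (Table 4B reports 0.569)
f1 = 2 * precision * recall / (precision + recall)
```

This confirms that the tabulated metrics are consistent with the confusion-matrix counts.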
FIG. 3 illustrates a block diagram representation 300 depicting a detailed flow of performing fraud detection from the set of concatenated merchant-related embeddings, in accordance with an embodiment of the present disclosure. As may be understood, in a non-limiting example, the one or more card-specific sequences such as the card-specific sequences 302 are provided to the first ML model 222 as input data. The data pre-processing module 228 facilitates the generation of the card-specific sequences 302.
In a specific embodiment, it may be noted that the architecture of the first ML model 222 is built using a Long Short-Term Memory (LSTM) model 304, a Feed Forward (FF) network 306, and an FF-deviation network 308 as the first ML model 222 is the deviation-based MTPP model for fraud detection. In some embodiments, other architectures such as those including transformers, etc., can also be used for the implementation of the first ML model 222, without limiting the scope of the invention.
Further, in a non-limiting example, it may be understood that the LSTM model 304 may receive the card-specific sequences 302 as input and learn temporal dependencies within the card-specific sequences 302 that are associated with labels. As used herein, the term ‘temporal dependency’ refers to a factor that involves the impact of previous behavior on current behavior.
Further, the FF network 306 and the FF-deviation network 308 together may predict an occurrence of a future payment transaction in case of fraud detection, and determine a deviation between the predicted occurrence and an actual occurrence of the future payment transaction. Ultimately, a fraudulent transaction label such as the fraudulent transaction label 310A or the non-fraudulent transaction label 310B may be generated and assigned to payment transactions based on the deviation and the predefined fraudulent threshold. In a non-limiting example, the predefined fraudulent threshold may be determined using an inbuilt NumPy function such as ‘numpy.argmax’.
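In a non-limiting sketch, threshold selection in the spirit of the `numpy.argmax` usage mentioned above may be illustrated by scanning candidate thresholds on deviation scores and picking the one that maximizes F1. The scoring setup here is an illustrative assumption, not the disclosed procedure.

```python
import numpy as np

# Illustrative sketch: choose a decision threshold over deviation scores by
# evaluating F1 at each candidate threshold and taking the best via numpy.argmax.
def pick_threshold(scores, labels, candidates):
    scores, labels = np.asarray(scores), np.asarray(labels)
    f1s = []
    for t in candidates:
        pred = scores >= t
        tp = np.sum(pred & (labels == 1))
        fp = np.sum(pred & (labels == 0))
        fn = np.sum(~pred & (labels == 1))
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
    return candidates[int(np.argmax(f1s))]
```

Transactions whose deviation score meets the chosen threshold would then receive the fraudulent transaction label.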
In an embodiment, the server system 200 may generate the first ML model 222 via the training module 230, and the process of generating the first ML model 222 is explained with reference to FIG. 6. Also, the process of determining fraudulent transactions via the first ML model 222 is explained with reference to FIG. 5.
To that end, during the process of determining the fraudulent transactions, a time factor (interchangeably also referred to as ‘time’, see 312) may be obtained as an intermediate output. The server system 200 may calculate the inter-event time (e.g., the time difference between consecutive events, such as between the current transaction and the future transaction) and express this in the unit ‘days’.
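In a non-limiting example, the inter-event time in days may be computed as:

```python
from datetime import datetime

# Illustrative sketch: inter-event time between two transactions, expressed
# in days as described above (86400 seconds per day).
def inter_event_days(ts_current, ts_next):
    return (ts_next - ts_current).total_seconds() / 86400.0
```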
In an embodiment, another intermediate output that may be obtained during the above-mentioned process may include sequence-related embeddings such as the sequence-related embeddings 314. Further, in an embodiment, the relevant set of embeddings is aggregated from the sequence-related embeddings 314 via the aggregation module 234 for generating a merchant-specific embedding such as the merchant-specific embedding 316.
In another embodiment, merchant velocity features such as velocity features 318 may be accessed from the database 204 via the generation module 232. The generation module 232 further generates merchant velocity embeddings 320 from the velocity features 318 via the second ML model 224. In an embodiment, the second ML model 224 may be generated by the server system 200 via the training module 230. The process of generating the second ML model 224 is explained with reference to FIG. 7.
Further, the merchant velocity embeddings 320 along with the merchant-specific embedding 316 for each merchant are provided to the concatenation module 236. The concatenation module 236 concatenates the merchant velocity embeddings 320 with each other and then concatenates the result with the merchant-specific embedding 316, thereby obtaining the set of concatenated merchant-related embeddings for each merchant such as the concatenated merchant-related embeddings 322. Finally, the merchant behavior pattern may be monitored via the generation module 232 and the merchants may be segregated by assigning a collusive merchant label 324A or a non-collusive merchant label 324B via the labeling module 238, based on a deviation of the merchant behavior pattern of each merchant as shown in FIG. 3.
FIG. 4A illustrates a process flow diagram 400 depicting a process of generating a merchant-specific embedding (e.g., the merchant-specific embedding 316) from a plurality of sequence-related embeddings (e.g., the sequence-related embeddings 218), in accordance with an embodiment of the present disclosure. As may be understood, the server system 200 generates the merchant-specific embedding 316 from the sequence-related embeddings 218 via the aggregation module 234. Further, it should be understood that as an embedding corresponds to a rich representation of features, the sequence-related embeddings 218 may be generated from features related to a sequence. Herein, the sequence may correspond to a sequence of payment transactions for a specific merchant, which may be generated from data related to the payment transactions. Herein, it may be noted that the payment transactions may have been performed by a plurality of cardholders 104 at a plurality of merchants 106. Further, in an example implementation, as the present disclosure intends to represent a merchant (e.g., the merchant 106(1)) using data of cardholders (e.g., the cardholders 104(1)-104(4)) that transacted at that particular merchant 106(1), the merchant-specific embedding 316 is generated from the sequence-related embeddings 218. Thus, for generating the merchant-specific embedding 316, one or more operations may be performed by the server system 200. The process starts from operation 402.
At operation 402, the server system 200 accesses the cardholder-related dataset. In an embodiment, the server system 200 accesses, via the data pre-processing module 228, the cardholder-related dataset from the database 204. The cardholder-related dataset may include historical information corresponding to a plurality of payment transactions performed by the plurality of cardholders (e.g., the cardholders 104(1)-104(4)) at the plurality of merchants (e.g., the merchant 106(1)) using a plurality of payment cards in the payment ecosystem. In an example, the cardholder-related dataset may also include real-time transaction data of the plurality of cardholders (e.g., the cardholders 104(1)-104(4)) and the plurality of merchants (e.g., the merchant 106(1)). The transaction data may include, but is not limited to, transaction attributes, such as transaction amount, a time stamp associated with the payment transactions indicating the timing of occurrence of the payment transactions, source of funds such as bank or credit cards, transaction channel used for loading funds such as a POS terminal or an ATM, transaction velocity features such as count and transaction amount sent in the past ‘x’ number of days to a particular user, transaction location information, external data sources, and other internal data to evaluate each transaction.
At operation 404, the server system 200 generates transaction sequence data. In an embodiment, the server system 200 generates, via the data pre-processing module 228, the transaction sequence data corresponding to each cardholder of the plurality of cardholders 104 based, at least in part, on the cardholder-related dataset and a time stamp associated with each payment transaction. In an embodiment, the transaction sequence data may include one or more card-specific sequences of payment transactions corresponding to each cardholder. Herein, each card-specific sequence may correspond to a sequence of payment transactions linked with a time of occurrence and a predetermined label and performed by a particular cardholder at the plurality of merchants 106.
In an embodiment, the predetermined label in the case of fraud detection may be one of a fraudulent transaction label and a non-fraudulent transaction label. In some embodiments, it may be noted that the transactions of a sequence may not occur at regular time intervals. For instance, a cardholder (e.g., the cardholder 104(1)) might have withdrawn money during the nighttime on one day, with the next transaction occurring the next morning, followed by a transaction two days later in the afternoon for making a certain purchase. Since the cardholder 104(1) makes payment transactions for personal needs that may arise at random, the payment transactions do not follow any particular order at regular intervals. Thus, it can be stated that each card-specific sequence is a sequence of payment transactions with irregular time intervals, performed by a particular cardholder and possibly specific to a particular merchant. In this sense, it is understandable that the sequence may vary from merchant to merchant and from cardholder to cardholder.
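By way of illustration only, the construction of such card-specific sequences of time-stamped, labeled events may be sketched in Python as follows; the record fields and function name are hypothetical and do not form part of the claimed system:

```python
from collections import defaultdict

def build_card_sequences(transactions):
    """Group raw transaction records into per-card sequences ordered by
    time of occurrence. Each record is assumed to be a dict with a
    'card_id', a 'timestamp' (e.g., epoch seconds), and a predetermined
    'label' (e.g., 0 = non-fraudulent, 1 = fraudulent)."""
    sequences = defaultdict(list)
    for txn in transactions:
        sequences[txn["card_id"]].append((txn["timestamp"], txn["label"]))
    # Sort each card's events by time; the inter-event intervals may be irregular.
    return {card: sorted(events) for card, events in sequences.items()}
```

Note that sorting by time stamp preserves the irregular inter-event intervals, which the downstream sequence model consumes directly.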
At operation 406, the server system 200 generates a plurality of sequence-related features. In an embodiment, the server system 200 generates, via the data pre-processing module 228, the plurality of sequence-related features (e.g., the sequence-related features 120) corresponding to each cardholder of the plurality of cardholders 104 for each card-specific sequence of the one or more card-specific sequences based, at least in part, on the transaction sequence data. In an example embodiment, the sequence-related features 120 may include at least: the time of payment transactions, Decision Intelligence (DI) score, location flag, payment card status flag, transaction status flag, a sequence type, label information, and the like.
At operation 408, the server system 200 stores the sequence-related features 120. In an embodiment, the server system 200 stores, via the data pre-processing module 228, the sequence-related features 120 for each cardholder and for each card-specific sequence in the database 204.
At operation 410, the server system 200 accesses the sequence-related features 120. In an embodiment, the server system 200 accesses, via the generation module 232, the sequence-related features 120 from the database 204.
At operation 412, the server system 200 generates, via a first Machine Learning (ML) model (e.g., the first ML model 222), a plurality of sequence-related embeddings. In an embodiment, the server system 200 generates, via the first ML model 222, the plurality of sequence-related embeddings based, at least in part, on the sequence-related features 120. In an embodiment, the server system 200 is facilitated by the generation module 232 for generating the plurality of sequence-related embeddings via the first ML model 222.
At operation 414, the server system 200 determines a relevant set of embeddings from the plurality of sequence-related embeddings. In an embodiment, the server system 200 determines, via the aggregation module 234, the relevant set of embeddings from the plurality of sequence-related embeddings such as the sequence-related embeddings 218.
In an example embodiment, it may be noted that the server system 200 determines the relevant set of embeddings for each merchant of the plurality of merchants 106 based, at least in part, on the plurality of sequence-related embeddings (e.g., the sequence-related embeddings 218). The relevant set of embeddings may indicate one or more merchant-specific embeddings corresponding to the corresponding merchant (e.g., the merchant 106(1)).
At operation 416, the server system 200 aggregates the relevant set of embeddings. In an embodiment, the server system 200 aggregates, via the aggregation module 234, the relevant set of embeddings together by applying an aggregation function on the relevant set of embeddings. In an embodiment, applying the aggregation function may correspond to computing a mean or an average of the corresponding relevant set of embeddings.
At operation 418, the server system 200 generates a merchant-specific embedding (e.g., the merchant-specific embedding 316) from the relevant set of embeddings. In an embodiment, the server system 200 generates, via the aggregation module 234, the merchant-specific embedding 316 based, at least in part, on aggregating the relevant set of embeddings for each merchant (e.g., the merchants 106).
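The aggregation of operations 416-418 reduces, for each merchant, the relevant set of embeddings to a single vector. A minimal sketch in Python, assuming the mean aggregation function described above (the function name is hypothetical and the embeddings are plain lists):

```python
def merchant_specific_embedding(relevant_embeddings):
    """Aggregate a merchant's relevant set of sequence-related embeddings
    into one merchant-specific embedding by an element-wise mean, i.e.,
    the aggregation function of operation 416."""
    dim = len(relevant_embeddings[0])
    count = len(relevant_embeddings)
    return [sum(emb[i] for emb in relevant_embeddings) / count
            for i in range(dim)]
```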
FIG. 4B illustrates a process flow diagram 420 depicting a process of generating a set of concatenated merchant-related embeddings (e.g., the set of concatenated merchant-related embeddings 314) for each merchant 106, in accordance with an embodiment of the present disclosure. As may be understood, the server system 200 generates the set of concatenated merchant-related embeddings 314, via the concatenation module 236. For generating the set of concatenated merchant-related embeddings 314, one or more operations may be performed by the server system 200. The process starts from operation 422.
At operation 422, the server system 200 accesses the merchant-related dataset. In an embodiment, the server system 200 accesses, via the data pre-processing module 228, the merchant-related dataset from the database 204. In an embodiment, the merchant-related dataset may include historical information corresponding to a plurality of payment transactions performed at each merchant of the plurality of merchants 106 in the payment ecosystem.
In an example, the merchant-related dataset may include merchant profile data associated with the plurality of merchants 106. In one embodiment, the merchant profile data may include data such as the merchant Doing business as (DBA) name, name of the merchant, transaction information of the plurality of cardholders 104 at a particular merchant (e.g., the merchant 106(1)), information of fraudulent or non-fraudulent transactions performed at the merchant, various terminals (e.g., Point-Of-Sale (POS) devices, Automated Teller Machines (ATMs), etc.) associated with each merchant (e.g., the merchant 106(1)), and the like.
At operation 424, the server system 200 generates the plurality of merchant velocity features. In an embodiment, the server system 200 generates, via the data pre-processing module 228, the plurality of merchant velocity features (e.g., the velocity features 318) corresponding to each merchant of the plurality of merchants 106 based, at least in part, on the merchant-related dataset.
As may be understood, in an embodiment, the velocity features 318 capture activity at a merchant (e.g., the merchant 106(1)) over a predefined interval. In an example, the predefined interval may be a snapshot interval of about 15 days. It may be noted that the velocity features 318 aggregate different types of transactions taking place in the predefined interval. For example, the velocity features 318 may include the average transaction amount at the merchant, the maximum transaction amount, etc.
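For illustration, computing such velocity features from the transaction amounts observed at one merchant during one snapshot interval may be sketched as follows; the feature names are merely hypothetical examples of the velocity features 318:

```python
def merchant_velocity_features(amounts_in_window):
    """Compute simple velocity features over one predefined snapshot
    interval (e.g., about 15 days of transaction amounts at a merchant)."""
    count = len(amounts_in_window)
    return {
        "txn_count": count,
        "avg_amount": sum(amounts_in_window) / count if count else 0.0,
        "max_amount": max(amounts_in_window, default=0.0),
    }
```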
At operation 426, the server system 200 stores the plurality of merchant velocity features. In an embodiment, the server system 200 stores, via the data pre-processing module 228, the plurality of merchant velocity features such as the velocity features 318 in the database 204.
At operation 428, the server system 200 accesses the plurality of merchant velocity features. In an embodiment, the server system 200 accesses, via the generation module 232, the plurality of merchant velocity features such as the velocity features 318 from the database 204 for further processing.
At operation 430, the server system 200 generates, via a second ML model (e.g., the second ML model 224), a plurality of merchant velocity embeddings. In an embodiment, the server system 200 generates, via the second ML model 224, a plurality of merchant velocity embeddings (e.g., the merchant velocity embeddings 320) based, at least in part, on the velocity features 318 corresponding to each merchant of the plurality of merchants 106 accessed from the database 204. Each merchant velocity embedding corresponds to each payment transaction performed at the corresponding merchant 106. In an example embodiment, the server system 200 is facilitated by the generation module 232 to generate the merchant velocity embeddings 320 via the second ML model 224.
At operation 432, the server system 200 concatenates the plurality of merchant velocity embeddings (e.g., the merchant velocity embeddings 320) with each other. In an embodiment, the server system 200 concatenates, via the concatenation module 236, the merchant velocity embeddings 320 with each other. In an example embodiment, the concatenation module 236 may concatenate the merchant velocity embeddings 320 with each other using the concatenation technique.
At operation 434, the server system 200 generates a concatenated merchant velocity embedding. In an embodiment, the server system 200 generates, via the concatenation module 236, a concatenated merchant velocity embedding based, at least in part, on concatenating the merchant velocity embeddings 320 with each other.
At operation 436, the server system 200 concatenates the merchant-specific embedding (e.g., the merchant-specific embedding 316) and the concatenated merchant velocity embedding. In an embodiment, the server system 200 concatenates, via the concatenation module 236, the merchant-specific embedding 316, and the concatenated merchant velocity embedding for each merchant 106. In an embodiment, the concatenation module 236 may concatenate the merchant-specific embedding 316 and the concatenated merchant velocity embedding using the concatenation technique.
At operation 438, the server system 200 generates a set of concatenated merchant-related embeddings (e.g., the set of concatenated merchant-related embeddings 314) for each merchant 106. In an embodiment, the server system 200 generates, via the concatenation module 236, the set of concatenated merchant-related embeddings 314 based, at least in part, on concatenating the merchant-specific embedding 316 and the plurality of merchant velocity embeddings 320 associated with each merchant 106.
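Operations 432-438 amount to simple vector concatenation. A minimal sketch, assuming plain Python lists as embeddings (the function name is hypothetical):

```python
def concatenate_merchant_embeddings(merchant_specific, velocity_embeddings):
    """Concatenate the merchant velocity embeddings with each other
    (operations 432-434), then append the result to the merchant-specific
    embedding (operations 436-438) to form one concatenated vector."""
    concatenated_velocity = [value for emb in velocity_embeddings
                             for value in emb]
    return list(merchant_specific) + concatenated_velocity
```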
FIG. 4C illustrates a process flow diagram 440 depicting a process of determining a merchant behavior pattern from the set of concatenated merchant-related embeddings (e.g., the concatenated merchant-related embeddings 314), in accordance with an embodiment of the present disclosure. As may be understood, the server system 200 determines the merchant behavior pattern via the generation module 232. In an embodiment, the server system 200 is facilitated by the generation module 232 to determine the merchant behavior pattern of each merchant of the plurality of merchants 106 via the second ML model (e.g., the second ML model 224). In an example embodiment, it may be noted that the server system 200 determines, via the second ML model 224, the merchant behavior pattern corresponding to each merchant of the plurality of merchants 106 based, at least in part, on the set of concatenated merchant-related embeddings (e.g., the concatenated merchant-related embeddings 314) and a predetermined collusion threshold. For determining the merchant behavior pattern, one or more operations may be performed by the server system 200. The process starts from operation 442.
At operation 442, the server system 200 accesses the set of concatenated merchant-related embeddings (e.g., the concatenated merchant-related embeddings 314) for each merchant 106. In an embodiment, the server system 200 accesses, via the generation module 232, the concatenated merchant-related embeddings 314 from the database 204.
At operation 444, the server system 200 identifies the merchant behavior pattern corresponding to each merchant 106. In an embodiment, the server system 200 identifies, via the generation module 232, the merchant behavior pattern corresponding to each merchant from the plurality of merchants 106 based, at least in part, on the set of concatenated merchant-related embeddings such as the concatenated merchant-related embeddings 314.
At operation 446, the server system 200 accesses a predicted merchant behavior pattern. In an embodiment, the server system 200 accesses, via the generation module 232, the predicted merchant behavior pattern from the database 204. As may be understood, since the second ML model 224 is used for determining the merchant behavior pattern, the second ML model 224 may be trained to learn the predicted merchant behavior pattern so that a real-time merchant behavior pattern can be compared with the predicted merchant behavior pattern to identify collusive merchants.
At operation 448, the server system 200 compares the identified merchant behavior pattern of each merchant with the predicted merchant behavior pattern. In an embodiment, the server system 200 compares, via the generation module 232, the identified merchant behavior pattern of each merchant with the predicted merchant behavior pattern.
At operation 450, the server system 200 generates a collusion score for each merchant. In an embodiment, the server system 200 generates, via the generation module 232, the collusion score for each merchant of the plurality of merchants 106 based, at least in part, on comparing the identified merchant behavior pattern with the predicted merchant behavior pattern.
At operation 452, the server system 200 identifies whether the collusion score is greater than or equal to the predetermined collusion threshold. In an embodiment, the server system 200 identifies, via the generation module 232, whether the collusion score is at least equal to the predetermined collusion threshold. In an embodiment, it may be noted that the predetermined collusion threshold is determined during the training of the second ML model 224. More specifically, the predetermined collusion threshold may be obtained based on the maximum F1 score determined on a validation dataset. As used herein, the term ‘F1 score’ refers to a metric used in machine learning for evaluating the performance of classification models; specifically, it is the harmonic mean of precision and recall. Thus, the F1 score may be computed while validating the performance of the second ML model 224 on the validation dataset and used to set the predetermined collusion threshold, so that the second ML model 224 achieves better model performance.
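By way of a non-limiting illustration, selecting the predetermined collusion threshold as the value maximizing the F1 score on a validation dataset may be sketched as follows; the function names and the candidate threshold grid are hypothetical:

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_collusion_threshold(scores, labels, candidates):
    """Return the candidate threshold maximizing F1 on the validation set."""
    return max(candidates,
               key=lambda t: f1_score(labels, [int(s >= t) for s in scores]))
```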
In an embodiment, if the collusion score identified by the server system 200 is at least equal to the predetermined collusion threshold, then the process flow moves to operation 454, otherwise to operation 456.
At operation 454, the server system 200 labels the corresponding merchant as the collusive merchant. In an embodiment, the server system 200 labels, via the labeling module 238, the corresponding merchant as the collusive merchant when the collusion score is at least equal to the predetermined collusion threshold.
At operation 456, the server system 200 labels the corresponding merchant as the non-collusive merchant. In an embodiment, the server system 200 labels, via the labeling module 238, the corresponding merchant as the non-collusive merchant when the collusion score is less than the predetermined collusion threshold.
FIG. 5 illustrates a process flow diagram 500 depicting a process of labeling a future payment transaction, in accordance with an embodiment of the present disclosure. As may be understood, the server system 200 labels the future payment transaction via the labeling module 238. Herein, the labeling module 238 facilitates the server system 200 to predict a label corresponding to the future payment transaction via the first ML model 222. In an example embodiment, in fraud detection, the label may include a fraudulent transaction or a non-fraudulent transaction. For predicting whether the future payment transaction is fraudulent or not, the server system 200 performs one or more operations. The process flow starts with operation 502.
At operation 502, the server system 200 predicts, via the first ML model 222, an occurrence of a future payment transaction at each merchant. In an embodiment, the generation module 232 facilitates the server system 200 to predict, via the first ML model 222, the occurrence of the future payment transaction at each merchant of the plurality of merchants 106 based, at least in part, on the plurality of sequence-related embeddings (e.g., the sequence-related embeddings 218) corresponding to each card-specific sequence. In an example embodiment, a predicted occurrence of the future payment transaction may correspond to a time of occurrence of the future payment transaction.
At operation 504, the server system 200 identifies an actual occurrence of the future payment transaction at each merchant. In an embodiment, the server system 200 identifies, via the generation module 232, the actual occurrence of the future payment transaction at each merchant of the plurality of merchants 106 based, at least in part, on historical information corresponding to payment transactions performed between the plurality of cardholders 104 and the plurality of merchants 106. The historical information may be accessed from the database 204.
At operation 506, the server system 200 determines, via the first ML model, a deviation between the actual occurrence and the predicted occurrence of the future payment transaction at each merchant. In an embodiment, the server system 200 determines, via the first ML model such as the first ML model 222, the deviation based, at least in part, on comparing the actual occurrence and the prediction of the first ML model 222. In an example embodiment, the generation module 232 facilitates the server system 200 to determine the deviation via the first ML model 222.
At operation 508, the server system 200 identifies whether the deviation is at least equal to a predefined fraudulent threshold. In an embodiment, the server system 200 identifies, via the labeling module 238, whether the deviation is at least equal to the predefined fraudulent threshold. In an example embodiment, the generation module 232 facilitates the server system 200 to identify whether the deviation is at least equal to the predefined fraudulent threshold via the first ML model 222. Herein, it may be noted that the predefined fraudulent threshold may be determined while training the first ML model 222 to learn to determine fraudulent payment transactions.
In an embodiment, it should be noted that if the server system 200 determines that the deviation is greater than or equal to the predefined fraudulent threshold, then the process flow moves to operation 510, otherwise, the process flow moves to operation 512.
At operation 510, the server system 200 labels the future payment transaction as a fraudulent transaction. In an embodiment, the server system 200 labels, via the labeling module 238, the future payment transaction as the fraudulent transaction when the deviation is identified to be greater than or equal to the predefined fraudulent threshold. In an example embodiment, the labeling module 238 facilitates the server system 200 to label the future payment transaction to be the fraudulent transaction via the first ML model 222.
At operation 512, the server system 200 labels the future payment transaction as a non-fraudulent transaction. In an embodiment, the server system 200 labels, via the labeling module 238, the future payment transaction as the non-fraudulent transaction when the deviation is identified to be less than the predefined fraudulent threshold. In an example embodiment, the labeling module 238 facilitates the server system 200 to label the future payment transaction to be the non-fraudulent transaction via the first ML model 222.
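The decision of operations 506-512 may be sketched, purely for illustration, as follows; the function name and the time units are hypothetical:

```python
def label_future_transaction(predicted_time, actual_time, fraud_threshold):
    """Label a future payment transaction by comparing the absolute
    deviation between its predicted and actual time of occurrence
    against the predefined fraudulent threshold."""
    deviation = abs(actual_time - predicted_time)
    return "fraudulent" if deviation >= fraud_threshold else "non-fraudulent"
```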
FIG. 6 illustrates a process flow diagram 600 depicting a process of training a first Machine Learning (ML) model (e.g., the first ML model 222), in accordance with an embodiment of the present disclosure. As may be understood, the first ML model 222 is used by the server system 200 to determine fraudulent payment transactions performed in a payment network (e.g., the payment network 112). During the process of determining the fraudulent payment transactions, sequence-related embeddings (e.g., the sequence-related embeddings 218) may be generated. Herein, it should be noted that the sequence-related embeddings 218 correspond to desired embeddings related to the cardholders 104 that may be used by the server system 200 for generating a merchant-specific embedding (e.g., the merchant-specific embedding 316). The merchant-specific embedding 316 for each merchant is further used for merchant monitoring or fraud detection in the merchants 106.
In an embodiment, it may also be noted that the server system 200 generates the first ML model 222 via the training module 230. For generating the first ML model 222, the server system 200 may perform a first set of operations iteratively until the performance of the first ML model 222 converges to first predefined criteria. Moreover, in non-limiting exemplary implementations, the first ML model 222 may be a deviation-based Marked Temporal Point Process (MTPP) model.
As used herein, the term ‘deviation-based MTPP model’ refers to a machine learning model used to analyze event sequences where each event is associated with a label (or marker) and focuses on deviation from predicted patterns. Using the deviation-based MTPP model, false positives may be reduced.
Further, in one specific embodiment, the deviation-based MTPP model may be built effectively using a Long short-term memory (LSTM) model or a Recurrent Neural Network (RNN) model. As used herein, the term ‘RNN’ refers to a class of a neural network that is helpful in modeling sequence data. It may be noted that RNN models produce predictive results in sequential data. Further, as used herein, the term ‘LSTM model’ refers to a variation of a Recurrent Neural Network (RNN) model that is specifically designed to capture long-term dependencies in sequential data such as, time series, speech, and text.
In an example embodiment, the deviation-based MTPP model may be built effectively by integrating a feed-forward (FF) network (as shown in FIG. 3) (e.g., the FF network 304) into the LSTM model or the RNN model, forming a hybrid architecture. In such an embodiment, it may be noted that the FF network may be integrated into the LSTM model or a similar RNN architecture in case the input data involves a combination of sequential and static features. Herein, the LSTM model processes the sequential data and captures temporal patterns. The FF network 304 can process static features or provide additional layers for modeling complex relationships. The LSTM model’s outputs or hidden states can be passed through the FF network 304 for further processing before calculating deviations and making predictions. For example, in medical diagnosis, time series patient data (e.g., vital signs) and static patient information (e.g., age, gender, etc.) may be available. The LSTM model can handle the temporal data, while the FF network processes the static features, and their outputs are combined to predict a patient’s health outcome.
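A minimal sketch of the hybrid combination step, assuming the LSTM branch's final hidden state has already been computed and the static features pass through one hypothetical ReLU feed-forward layer (the function and parameter names are illustrative only):

```python
def hybrid_combine(lstm_hidden, static_features, W_static, b_static):
    """Combine the temporal branch (final LSTM hidden state) with a
    feed-forward encoding of static features, as in the hybrid
    architecture described above; the concatenated result feeds the
    subsequent deviation and prediction layers."""
    # One ReLU feed-forward layer over the static features.
    ff_out = [max(0.0, sum(w * x for w, x in zip(row, static_features)) + b)
              for row, b in zip(W_static, b_static)]
    return list(lstm_hidden) + ff_out
```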
Further, in an embodiment, the deviation-based MTPP model may also include an FF-deviation network 306. Herein, the FF-deviation network 306 may be used by the server system 200 to compute deviations in data provided to the first ML model 222.
For example, the deviation-based MTPP model receives sequence-related features (e.g., the sequence-related features 120) as an input. As may be understood, the sequence-related features correspond to the one or more card-specific sequences of payment transactions performed by the cardholders 104 at the merchants 106. Herein, each event in a card-specific sequence is denoted by its time of occurrence and a corresponding label.
Further, mathematically, each card-specific sequence is represented by S = {(t1, y1), (t2, y2), ..., (tn, yn)}, wherein ‘n’ refers to the total sequence length. Further, (tj, yj) refers to the jth event, represented by the time of the event (tj) and the corresponding label (yj). By default, the events are ordered in time, such that tj+1 ≥ tj. Given the card-specific sequence of the last ‘n’ events, the task is often to predict the next event time tn+1 and the corresponding label yn+1. In a non-limiting example, the value of ‘n’ is fixed and equals 9 in the current pipeline, i.e., for each payment card of the cardholder, the first ML model 222 ingests the last 9 transactions and predicts the time and label for the 10th transaction.
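The fixed-length windowing described above (the last n = 9 events predicting the 10th) may be sketched as follows; the function name is hypothetical:

```python
def make_training_windows(sequence, n=9):
    """Slice a card-specific sequence S = [(t1, y1), ..., (tN, yN)] into
    training examples: each window holds the last n events as input, and
    the (n+1)-th event's time and label form the prediction target."""
    return [(sequence[i:i + n], sequence[i + n])
            for i in range(len(sequence) - n)]
```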
Further, for each event in the card-specific sequence, multiple labels (say, ‘m’ labels per event) can be used as inputs. In this case, individual encodings of each label are concatenated into one vector. In a non-limiting example, the concatenated embeddings denoted by yc are calculated by concatenating all the ‘m’ embeddings as:
y_cj = [y_1 | y_2 | y_3 | … | y_m] … Eqn. 1
This alters the input to the subsequent LSTM layer to:
h_j = ReLU(W^y y_cj^em + W^d d_j + W^h h_(j-1) + b_h) … Eqn. 2
Herein, the terms ‘W^y’, ‘W^d’, ‘W^h’, and ‘b_h’ refer to a marker weight matrix, a time duration weight matrix, the RNN’s or LSTM’s historical representation weight matrix, and a bias vector, respectively. Further, the term ‘d_j’ refers to the inter-event time and the term ‘h_j’ refers to the RNN’s or LSTM’s hidden state.
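Eqn. 2 may be sketched numerically as follows, purely for illustration, with the weight matrices represented as lists of rows and W^d applied to the scalar inter-event time d_j (the function name is hypothetical):

```python
def hidden_state_update(W_y, y_c, W_d, d_j, W_h, h_prev, b_h):
    """Compute h_j = ReLU(W^y y_cj + W^d d_j + W^h h_(j-1) + b_h), i.e.,
    the recurrent update of Eqn. 2, where y_c is the concatenated marker
    embedding of Eqn. 1 and d_j is the inter-event time."""
    def matvec(W, v):
        return [sum(w * x for w, x in zip(row, v)) for row in W]
    pre_activation = [a + b + c + d for a, b, c, d in
                      zip(matvec(W_y, y_c), matvec(W_d, [d_j]),
                          matvec(W_h, h_prev), b_h)]
    return [max(0.0, x) for x in pre_activation]  # element-wise ReLU
```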
To that end, the first set of operations may include initializing the first ML model 222 based, at least in part, on one or more first model parameters (e.g., the first model parameters 602) (see, 604). In an embodiment, the first model parameters 602 may include at least: a predefined temporal pattern for an event sequence, an intensity function, a deviation threshold, a labeled event distribution, and the like.
The first set of operations may further include detecting, via the first ML model 222, an event based, at least in part, on the plurality of sequence-related embeddings (e.g., the sequence-related embeddings 608) (see, 606). Herein, it may be noted that the sequence-related embeddings 608 are substantially similar to the sequence-related embeddings 218 as shown in FIG. 3. For example, in case of fraud detection in the payment network 112, the server system 200 may be configured to detect, via the first ML model 222, an event such as the occurrence of the future payment transaction based, at least in part, on the plurality of sequence-related embeddings (e.g., the sequence-related embeddings 608).
Further, the server system 200 may be configured to compute, via the first ML model 222, a deviation between the predicted occurrence and an actual occurrence of the future payment transaction (e.g., the predicted event pattern 610) (see, 612). Herein, in an embodiment, the deviation may be computed by comparing the predicted occurrence with the actual occurrence of the future payment transaction. For example, in the case of fraud detection, suppose a future payment transaction occurs after two days; however, it was predicted to occur after four days. The deviation here is two days, the transaction having occurred two days earlier than predicted.
The server system 200 may further be configured to optimize the first model parameters 602 associated with the first ML model 222 based, at least in part, on the deviation (see, 614). Upon optimizing the first model parameters 602, the server system 200 may be configured to update the predicted occurrence of the future payment transaction based on the optimization.
In an embodiment, it may be noted that upon optimization, the server system 200 may further be configured to determine if the performance of the first ML model 222 has converged to the first predefined criteria (see, 616). If the performance is determined to have converged, then the process of training the first ML model 222 stops (see, 618); otherwise, the first set of operations repeats until the convergence of the first ML model 222.
In some embodiments, the first predefined criteria for the convergence of the first ML model may be dependent on one or more factors such as a number of iterations, a minimum improvement in loss, validation set performance, early stopping criteria, and the like. Herein, in an embodiment, the factor of the number of iterations may refer to a criterion in which a maximum number of iterations is fixed for the training process and once the model reaches that number, the training process stops. In another embodiment, the factor of the minimum improvement in loss may refer to a criterion in which a minimum improvement in the loss function in consecutive iterations is observed and if the improvement falls below a predefined threshold the training process stops.
In yet another embodiment, the factor of validation set performance may refer to monitoring a performance of the first ML model 222 on a validation set at each iteration, and if the performance of the model on the validation set does not improve or starts to degrade, then the training process stops. Further, in an embodiment, the factor of early stopping criteria may refer to a criterion that may be chosen to prevent overfitting. It involves dividing the training data into training and validation sets. The training process continues until the performance on the validation set stops improving, and then the first ML model 222 with the best performance on the validation set is selected as the final model.
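Purely by way of illustration, the minimum-improvement-in-loss and early stopping criteria described above may be sketched as follows; the function name and the patience and min_delta defaults are hypothetical:

```python
def select_best_epoch(validation_losses, patience=2, min_delta=1e-4):
    """Early stopping sketch: stop when the validation loss fails to
    improve by at least min_delta for `patience` consecutive epochs, and
    return the index of the best epoch (the model checkpoint to keep)."""
    best_loss, best_epoch, bad_epochs = float("inf"), -1, 0
    for epoch, loss in enumerate(validation_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # convergence: no meaningful improvement
    return best_epoch
```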
FIG. 7 illustrates a process flow diagram 700 depicting a process of training a second ML model (e.g., the second ML model 224), in accordance with an embodiment of the present disclosure. As may be understood, the second ML model 224 is used by the server system 200 to determine collusive merchants present in a payment network (e.g., the payment network 112). During the process of determining the collusive merchants, a merchant-specific embedding (e.g., the merchant-specific embedding 316) for each merchant, which is obtained from the first ML model 222, is used. Along with this, merchant velocity embeddings such as the merchant velocity embeddings 320 are also used. Herein, it may be noted that a set of concatenated merchant-related embeddings (e.g., the concatenated merchant-related embeddings 322) may be obtained from the merchant-specific embedding 316 and the merchant velocity embeddings 320 upon concatenation. Further, the server system 200 determines the merchant behavior pattern based on the processing of the concatenated merchant-related embeddings 322 using the second ML model 224.
In an embodiment, it may be noted that the server system 200 generates the second ML model 224 via the training module 230. For generating the second ML model 224, the server system 200 may perform a second set of operations iteratively until the performance of the second ML model 224 converges to second predefined criteria. Moreover, in some embodiments, the second ML model 224 may be a Gradient Boosting Machine (GBM) model.
As used herein, the term ‘Gradient Boosting Machine model’ refers to a model that involves building an ensemble of weak learners (typically decision trees) sequentially, where each new learner corrects the errors made by the previous ones. It may be noted that GBM models can be trained to analyze historical merchant data and detect anomalies in transaction patterns, such as unusual transaction volumes or unexpected revenue fluctuations. They can flag merchants whose behavior deviates significantly from normal behavior.
In a specific embodiment, the second set of operations may start from operation 702.
At operation 702, the server system 200 may perform initializing the second ML model 224 based, at least in part, on one or more second model parameters and an initial prediction. In a non-limiting example, the one or more second model parameters may include a number of trees, a learning rate, a tree depth, minimum samples per leaf, minimum samples per split, maximum features, a loss function, and the like. Herein, the initial prediction may include a simple prediction such as a mean of the target values for regression tasks, or the most frequent class for classification tasks.
At operation 704, the server system 200 may perform generating, via the second ML model 224, a decision tree for generating predictions based, at least in part, on the set of concatenated merchant-related embeddings (e.g., the concatenated merchant-related embeddings 322). In a non-limiting implementation, the decision tree may correspond to a classification and regression tree. It may be noted that the decision tree may be generated based on a training dataset extracted from the merchant-related dataset. Thus, it may be understood that the concatenated merchant-related embeddings 322 may be generated from the training dataset.
At operation 706, the server system 200 may perform computing one or more optimization parameters based at least on a performance of the decision tree, using one or more optimization functions. In a non-limiting example, the one or more optimization parameters may include a gradient and a hessian, each computed using its respective optimization function. More specifically, the gradient indicates how much the predictions need to be updated to reduce the overall loss, while the hessian is used to adjust the step size (learning rate) during the optimization process.
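For instance, with a logistic loss the gradient and hessian of operation 706 take a simple closed form. The following is a minimal sketch; the function name and the choice of logistic loss are illustrative assumptions, not part of the disclosure:

```python
import math

def logistic_grad_hess(y_true, raw_score):
    """Gradient and hessian of the logistic loss at one sample.

    For L = -[y*log(p) + (1-y)*log(1-p)] with p = sigmoid(raw_score),
    dL/draw = p - y and d2L/draw2 = p * (1 - p).
    """
    p = 1.0 / (1.0 + math.exp(-raw_score))
    grad = p - y_true        # how much the prediction must move to reduce the loss
    hess = p * (1.0 - p)     # curvature, used to scale the update step
    return grad, hess

g, h = logistic_grad_hess(1.0, 0.0)   # p = 0.5, so grad = -0.5 and hess = 0.25
```

A negative gradient here means the raw score should be increased for this sample, which is exactly the signal the next tree in the ensemble is fit to.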
At operation 708, the server system 200 may perform building an updated decision tree for reducing the one or more optimization parameters. Herein, it may be noted that the updated decision tree may reduce the one or more optimization parameters by capturing patterns and relationships that were not well-modeled by the previous trees.
At operation 710, the server system 200 may perform assigning weights to the updated decision tree and the previously generated decision tree based, at least in part, on a performance of the updated decision tree and the previously generated decision tree in reducing the one or more optimization parameters. For example, decision trees that contribute more to the model’s predictive performance receive higher weights.
At operation 712, the server system 200 may perform generating an ensemble of a plurality of decision trees by adding predictions of the updated decision tree to predictions of the previously generated decision tree.
At operation 714, the server system 200 may optimize the one or more second model parameters.
At operation 716, the server system 200 may check whether the performance of the second ML model 224 has converged to the second predefined criteria. If the second ML model 224 has converged to the second predefined criteria, the training of the second ML model 224 stops (see, 718); otherwise, the second set of operations in the training of the second ML model 224 is repeated as shown in FIG. 7.
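The second set of operations above can be sketched end-to-end as a minimal gradient-boosting loop. This is illustrative only: it uses depth-1 regression stumps on a one-dimensional feature and an L2 loss (for which the negative gradient is simply the residual), whereas the disclosure leaves the tree structure and loss function open:

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree (stump) to the current residuals."""
    best = None
    for thr in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, thr, lmean, rmean)
    _, thr, lmean, rmean = best
    return lambda x: lmean if x <= thr else rmean

def fit_gbm(xs, ys, n_trees=25, lr=0.3, tol=1e-9):
    base = sum(ys) / len(ys)            # operation 702: initial prediction (mean)
    preds = [base] * len(xs)
    trees, prev_loss = [], float("inf")
    for _ in range(n_trees):            # iterate the second set of operations
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient, L2 loss
        tree = fit_stump(xs, residuals)  # operations 704/708: new tree on residuals
        trees.append(tree)
        preds = [p + lr * tree(x) for p, x in zip(preds, xs)]  # operation 712
        loss = sum((y - p) ** 2 for y, p in zip(ys, preds))
        if prev_loss - loss < tol:      # operation 716: convergence check
            break
        prev_loss = loss
    return lambda x: base + lr * sum(t(x) for t in trees)

# Toy step-function target: the ensemble should learn the jump between x=1 and x=2.
model = fit_gbm([0, 1, 2, 3], [0.0, 0.0, 1.0, 1.0])
```

Each added tree shrinks the residuals by roughly the learning-rate factor, so after a few dozen rounds the ensemble reproduces the step almost exactly.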
In some embodiments, the second predefined criteria for the convergence of the second ML model 224 may be dependent on one or more factors such as a number of iterations or trees, a minimum improvement in loss, validation set performance, and early stopping criteria. Herein, in an embodiment, the factor of the number of iterations or trees may refer to a criterion in which a maximum number of iterations is fixed for the training process, and once the model reaches that number, the training process stops. In another embodiment, the factor of the minimum improvement in loss may refer to a criterion in which the improvement of the loss function over consecutive iterations is observed, and if the improvement falls below a predefined threshold, the training process stops.
In yet another embodiment, the factor of validation set performance may refer to monitoring a performance of the model on a validation set at each iteration and if the performance of the model on the validation set does not improve or starts to degrade, then the training process stops. Further, in an embodiment, the factor of early stopping criteria may refer to a criterion that may be chosen to prevent overfitting. It involves dividing the training data into training and validation sets. The training process continues until the performance on the validation set stops improving, and then the model with the best performance on the validation set is selected as the final model.
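The validation-set and early-stopping criteria described above can be sketched as a generic training driver. This is a hypothetical illustration: `train_step` and `val_loss` stand in for one boosting iteration and the validation-set loss, neither of which is pinned down here:

```python
def train_with_early_stopping(train_step, val_loss, max_rounds=100, patience=5):
    """Stop when the validation loss has not improved for `patience`
    consecutive rounds, and report the best round seen (model selection)."""
    best_loss, best_round, stale = float("inf"), 0, 0
    for rnd in range(1, max_rounds + 1):
        train_step()                      # one boosting iteration
        loss = val_loss()
        if loss < best_loss:
            best_loss, best_round, stale = loss, rnd, 0
        else:
            stale += 1
            if stale >= patience:         # validation performance stopped improving
                break
    return best_round, best_loss

# Toy run: the validation loss bottoms out at round 3 and then degrades.
losses = iter([5.0, 4.0, 3.0, 3.5, 3.6, 3.7, 3.8, 3.9, 4.0])
best_round, best_loss = train_with_early_stopping(
    lambda: None, lambda: next(losses), patience=5)
```

Returning the best round rather than the last one is what lets the model with the best validation performance be selected as the final model.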
In an embodiment, basic elements of the GBM model include a loss function, weak learners, and an additive model. In an embodiment, it may be noted that based on a type of response variable ‘y’, the loss function may be classified into different types. In an example embodiment, for a continuous response, the loss function may include a Gaussian L2 loss function, a Laplace L1 loss function, a Huber loss function with δ specified, and a Quantile loss function with α specified. In another example embodiment, for a categorical response, the loss function may include a Binomial loss function and an Adaboost loss function. In yet another example embodiment, for other families of responses, the loss function may include loss functions for survival models, count data, and custom loss functions.
Further, the term ‘weak learner’ refers to a base learner model that learns from past errors and helps in building a strong predictive model for boosting algorithms in machine learning. Generally, decision trees work as weak learners in boosting algorithms. Furthermore, the ‘additive model’ refers to the procedure of adding trees to the model. It may be noted that in the case of the GBM model, a single tree is added at a time to the existing trees in the model. Moreover, every tree that is added reduces the error of the model and improves its performance. In a non-limiting example, the additive model of ‘B’ additive trees may be represented by the following equation:
f(x) = Σ_{b=1}^{B} f_b(x) … Eqn. 3
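Eqn. 3 can be read directly as code: the ensemble output is the sum of the individual tree outputs. The stump functions below are illustrative stand-ins for fitted trees:

```python
def additive_predict(trees, x):
    """Eqn. 3: f(x) = sum of f_b(x) over the b = 1..B trees in the ensemble."""
    return sum(tree(x) for tree in trees)

stumps = [lambda x: 1.0 if x > 0 else -1.0,   # f_1
          lambda x: 0.5 if x > 2 else 0.0]    # f_2

additive_predict(stumps, 3)    # 1.0 + 0.5 = 1.5
```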
FIG. 8 illustrates a flow diagram depicting a method 800 for generating merchant-related embeddings (e.g., the concatenated merchant-related embeddings 322) for learning merchant behavior, in accordance with an embodiment of the present disclosure. The method 800 depicted in the flow diagram may be executed by, for example, the server system 200. The sequence of operations of the method 800 may not necessarily be executed in the same order as they are presented. Further, one or more operations may be grouped and performed in the form of a single step, or one operation may have several sub-steps that may be performed in parallel or in a sequential manner. Operations of the method 800, and combinations of operations in the method 800, may be implemented by, for example, hardware, firmware, a processor, circuitry, and/or a different device associated with the execution of software that includes one or more computer program instructions. The plurality of operations is depicted in the process flow of the method 800. The process flow starts at operation 802.
At 802, the method 800 includes accessing, by a server system (e.g., the server system 200), a plurality of sequence-related features (e.g., the sequence-related features 218) from a database (e.g., the database 204) associated with the server system 200. The sequence-related features 218 correspond to one or more card-specific sequences of payment transactions performed by a plurality of cardholders (e.g., the cardholders 104) at a plurality of merchants (e.g., the merchants 106). Herein, each card-specific sequence of the one or more card-specific sequences corresponds to a particular cardholder from the plurality of cardholders 104.
At 804, the method 800 includes generating, by the server system 200 via a first Machine Learning (ML) model (e.g., the first ML model 222), a plurality of sequence-related embeddings (e.g., the sequence-related embeddings 312) based, at least in part, on the sequence-related features 218. Herein, each sequence-related embedding corresponds to each card-specific sequence (e.g., the card-specific sequences 302).
At 806, the method 800 includes generating, by the server system 200, a merchant-specific embedding (e.g., the merchant-specific embedding 316) for each merchant of the plurality of merchants 106 based, at least in part, on aggregating a relevant set of embeddings of the plurality of sequence-related embeddings 312.
At 808, the method 800 includes generating, by the server system 200 via a second ML model (e.g., the second ML model 224), a plurality of merchant velocity embeddings (e.g., the merchant velocity embeddings 320) based, at least in part, on a plurality of merchant velocity features (e.g., the merchant velocity features 318) corresponding to each merchant accessed from the database 204. Herein, each merchant velocity embedding of the merchant velocity embeddings 320 corresponds to each payment transaction performed at the corresponding merchant (e.g., the merchant 106(1)).
At 810, the method 800 includes generating, by the server system 200, a set of concatenated merchant-related embeddings (e.g., the concatenated merchant-related embeddings 322) for each merchant of the plurality of merchants 106 based, at least in part, on concatenating the merchant-specific embedding 316 with the plurality of merchant velocity embeddings 320 associated with each merchant 106.
At 812, the method 800 includes determining, by the server system 200 via the second ML model 224, a merchant behavior pattern corresponding to each merchant based, at least in part, on the set of concatenated merchant-related embeddings such as the concatenated merchant-related embeddings 322 and a predetermined collusion threshold.
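Operations 806 through 812 can be sketched as follows. This is a minimal illustration only: mean pooling is one hypothetical choice for the predefined aggregation technique, and the merchant identifiers and embedding values are made up:

```python
def mean_pool(vectors):
    """Hypothetical aggregation: element-wise mean of the relevant embeddings."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def merchant_embeddings(seq_embeddings, velocity_embeddings):
    """seq_embeddings: merchant_id -> the relevant set of sequence-related
    embeddings for that merchant; velocity_embeddings: merchant_id -> that
    merchant's velocity embedding. Returns the concatenated merchant-related
    embedding per merchant (operations 806-810)."""
    out = {}
    for mid, vecs in seq_embeddings.items():
        merchant_specific = mean_pool(vecs)                       # operation 806
        out[mid] = merchant_specific + velocity_embeddings[mid]   # operation 810
    return out

def label_collusive(scores, threshold):
    """Operation 812: compare each merchant's collusion score with the
    predetermined collusion threshold."""
    return {mid: ("collusive" if s >= threshold else "non-collusive")
            for mid, s in scores.items()}

embs = merchant_embeddings({"m1": [[1.0, 2.0], [3.0, 4.0]]}, {"m1": [0.5]})
labels = label_collusive({"m1": 0.9, "m2": 0.1}, 0.5)
```

In practice the collusion scores would come from the second ML model applied to the concatenated embeddings; here they are supplied directly to keep the sketch self-contained.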
FIG. 9 illustrates a simplified block diagram of the payment server 900, in accordance with an embodiment of the present disclosure. The payment server 900 is an example of the payment server 114 of FIG. 1. The payment server 900 and the server system 200 may use the payment network 112 as a payment interchange network. Examples of payment interchange networks include, but are not limited to, the Mastercard® payment system interchange network.
The payment server 900 includes a processing module 902 configured to extract programming instructions from a memory 904 to provide various features of the present disclosure. The components of the payment server 900 provided herein may not be exhaustive, and the payment server 900 may include more or fewer components than that depicted in FIG. 9. Further, two or more components may be embodied in one single component, and/or one component may be configured using multiple sub-components to achieve the desired functionalities. Some components of the payment server 900 may be configured using hardware elements, software elements, firmware elements, and/or a combination thereof.
Via a communication module 906, the processing module 902 receives a request from a remote device 908, such as the issuer servers 108, the acquirer servers 110, or the server system 102. The request may be a request for conducting the payment transaction. The communication may be achieved through API calls, without loss of generality. The payment server 900 includes a database 910, which stores transaction processing data such as issuer ID, country code, acquirer ID, and merchant identifier (MID), among others.
When the payment server 900 receives a payment transaction request from the acquirer servers 110 or a payment terminal (e.g., IoT device), the payment server 900 may route the payment transaction request to the issuer servers 108. The database 910 stores transaction identifiers for identifying transaction details such as transaction amount, IoT device details, acquirer account information, transaction records, merchant account information, and the like.
In one example embodiment, the acquirer servers 110 are configured to send an authorization request message to the payment server 900. The authorization request message includes, but is not limited to, the payment transaction request.
The processing module 902 further sends the payment transaction request to the issuer servers 108 for facilitating the payment transactions from the remote device 908. The processing module 902 is further configured to notify the remote device 908 of the transaction status in the form of an authorization response message via the communication module 906. The authorization response message includes, but is not limited to, a payment transaction response received from the issuer servers 108. Alternatively, in one embodiment, the processing module 902 is configured to send an authorization response message for declining the payment transaction request, via the communication module 906, to the acquirer servers 110. In one embodiment, the processing module 902 executes operations similar to those performed by the server system 200; however, for the sake of brevity, these operations are not explained herein.
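The authorization request/response flow described above can be sketched as follows. The type, field names, and the `issuer_approves` callable (standing in for the issuer servers 108) are hypothetical illustrations, not the disclosed message formats:

```python
from dataclasses import dataclass

@dataclass
class AuthorizationRequest:
    # hypothetical subset of the transaction processing data named above
    issuer_id: str
    acquirer_id: str
    merchant_id: str
    amount: float

def route(request, issuer_approves):
    """Forward the authorization request to the issuer and wrap the issuer's
    decision in an authorization response message for the requester."""
    approved = issuer_approves(request)
    return {"merchant_id": request.merchant_id,
            "status": "approved" if approved else "declined"}

# Toy issuer policy: approve transactions under 100 units.
resp = route(AuthorizationRequest("iss1", "acq1", "m1", 25.0),
             lambda r: r.amount < 100.0)
```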
The disclosed method with reference to FIG. 9, or one or more operations of the server system 200, may be implemented using software including computer-executable instructions stored on one or more computer-readable media (e.g., non-transitory computer-readable media, such as one or more optical media discs, volatile memory components (e.g., DRAM or SRAM), or nonvolatile memory or storage components (e.g., hard drives or solid-state nonvolatile memory components, such as Flash memory components)) and executed on a computer (e.g., any suitable computer, such as a laptop computer, netbook, Web book, tablet computing device, smartphone, or other mobile computing devices). Such software may be executed, for example, on a single local computer or in a network environment (e.g., via the Internet, a wide-area network, a local-area network, a remote web-based server, a client-server network (such as a cloud computing network), or other such networks) using one or more network computers. Additionally, any of the intermediate or final data created and used during the implementation of the disclosed methods or systems may also be stored on one or more computer-readable media (e.g., non-transitory computer-readable media) and are considered to be within the scope of the disclosed technology. Furthermore, any of the software-based embodiments may be uploaded, downloaded, or remotely accessed through a suitable communication means. Such suitable communication means include, for example, the Internet, the World Wide Web, an intranet, software applications, cable (including fiber optic cable), magnetic communications, electromagnetic communications (including RF, microwave, and infrared communications), electronic communications, or other such communication means.
Although the invention has been described with reference to specific exemplary embodiments, it is noted that various modifications and changes may be made to these embodiments without departing from the broad scope of the invention. For example, the various operations, blocks, etc., described herein may be enabled and operated using hardware circuitry (for example, Complementary Metal Oxide Semiconductor (CMOS) based logic circuitry), firmware, software, and/or any combination of hardware, firmware, and/or software (for example, embodied in a machine-readable medium). For example, the apparatuses and methods may be embodied using transistors, logic gates, and electrical circuits (for example, Application-Specific Integrated Circuit (ASIC) circuitry and/or in Digital Signal Processor (DSP) circuitry).
Particularly, the server system 200 and its various components may be enabled using software and/or using transistors, logic gates, and electrical circuits (for example, integrated circuit circuitry such as ASIC circuitry). Various embodiments of the invention may include one or more computer programs stored or otherwise embodied on a computer-readable medium, wherein the computer programs are configured to cause a processor or the computer to perform one or more operations. A computer-readable medium storing, embodying, or encoded with a computer program, or similar language, may be embodied as a tangible data storage device storing one or more software programs that are configured to cause a processor or computer to perform one or more operations. Such operations may be, for example, any of the steps or operations described herein. In some embodiments, the computer programs may be stored and provided to a computer using any type of non-transitory computer-readable media. Non-transitory computer-readable media includes any type of tangible storage media. Examples of non-transitory computer-readable media include magnetic storage media (such as floppy disks, magnetic tapes, hard disk drives, etc.), optical magnetic storage media (e.g., magneto-optical disks), Compact Disc Read-Only Memory (CD-ROM), Compact Disc Recordable (CD-R), Compact Disc Rewritable (CD-R/W), Digital Versatile Disc (DVD), BLU-RAY® Disc (BD), and semiconductor memories (such as mask ROM, Programmable ROM (PROM), Erasable PROM (EPROM), flash memory, Random Access Memory (RAM), etc.). Additionally, a tangible data storage device may be embodied as one or more volatile memory devices, one or more non-volatile memory devices, and/or a combination of one or more volatile memory devices and non-volatile memory devices. In some embodiments, the computer programs may be provided to a computer using any type of transitory computer-readable media.
Examples of transitory computer-readable media include electric signals, optical signals, and electromagnetic waves. Transitory computer-readable media can provide the program to a computer via a wired communication line (e.g., electric wires, and optical fibers) or a wireless communication line.
Various embodiments of the invention, as discussed above, may be practiced with steps and/or operations in a different order, and/or with hardware elements in configurations, which are different from those which, are disclosed. Therefore, although the invention has been described based on these exemplary embodiments, it is noted that certain modifications, variations, and alternative constructions may be apparent and well within the scope of the invention.
Although various exemplary embodiments of the invention are described herein in a language specific to structural features and/or methodological acts, the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as exemplary forms of implementing the claims.
Claims:
1. A computer-implemented method, comprising:
accessing, by a server system, a plurality of sequence-related features from a database associated with the server system, the plurality of sequence-related features corresponding to one or more card-specific sequences of payment transactions performed by a plurality of cardholders at a plurality of merchants, each card-specific sequence of the one or more card-specific sequences corresponds to a particular cardholder from the plurality of cardholders;
generating, by the server system via a first Machine Learning (ML) model, a plurality of sequence-related embeddings based, at least in part, on the plurality of sequence-related features, each sequence-related embedding of the plurality of sequence-related embeddings corresponds to each card-specific sequence;
generating, by the server system, a merchant-specific embedding for each merchant of the plurality of merchants based, at least in part, on aggregating a relevant set of embeddings of the plurality of sequence-related embeddings;
generating, by the server system via a second ML model, a plurality of merchant velocity embeddings based, at least in part, on a plurality of merchant velocity features corresponding to each merchant accessed from the database, each merchant velocity embedding corresponds to each payment transaction performed at the corresponding merchant;
generating, by the server system, a set of concatenated merchant-related embeddings for each merchant based, at least in part, on concatenating the merchant-specific embedding with the plurality of merchant velocity embeddings associated with each merchant; and
determining, by the server system via the second ML model, a merchant behavior pattern corresponding to each merchant based, at least in part, on the set of concatenated merchant-related embeddings and a predetermined collusion threshold.
2. The computer-implemented method as claimed in claim 1, further comprising:
accessing, by the server system, a cardholder-related dataset from the database, the cardholder-related dataset comprising historical information corresponding to payment transactions performed by the plurality of cardholders at the plurality of merchants using a plurality of payment cards;
generating, by the server system, transaction sequence data corresponding to each cardholder based, at least in part, on the cardholder-related dataset and a time stamp associated with each payment transaction, the transaction sequence data comprising the one or more card-specific sequences of payment transactions corresponding to each cardholder, each card-specific sequence corresponds to a sequence of payment transactions linked with a time of occurrence and a predetermined label and performed by a particular cardholder at the plurality of merchants;
generating, by the server system, a plurality of sequence-related features corresponding to each cardholder for each card-specific sequence based, at least in part, on the transaction sequence data; and
storing, by the server system, the plurality of sequence-related features for each cardholder and for each card-specific sequence in the database.
3. The computer-implemented method as claimed in claim 1, wherein generating the merchant-specific embedding for each merchant comprises:
determining, by the server system, the relevant set of embeddings for each merchant based, at least in part, on the plurality of sequence-related embeddings, the relevant set of embeddings indicating one or more sequence-related embeddings of the plurality of sequence-related embeddings corresponding to the corresponding merchant,
aggregating, by the server system, the relevant set of embeddings for each merchant using a predefined aggregation technique, and
generating, by the server system, the merchant-specific embedding for each merchant based, at least in part, on aggregating the relevant set of embeddings for each merchant.
4. The computer-implemented method as claimed in claim 1, further comprising:
accessing, by the server system, a merchant-related dataset from the database, the merchant-related dataset comprising historical information corresponding to payment transactions performed at each merchant of the plurality of merchants;
generating, by the server system, a plurality of merchant velocity features corresponding to each merchant of the plurality of merchants based, at least in part, on the merchant-related dataset; and
storing, by the server system, the plurality of merchant velocity features for each merchant in the database.
5. The computer-implemented method as claimed in claim 1, wherein determining, via the second ML model, the merchant behavior pattern comprises:
identifying, by the server system, the merchant behavior pattern corresponding to each merchant based, at least in part, on the set of concatenated merchant-related embeddings,
generating and assigning a collusion score to at least one merchant of the plurality of merchants based, at least in part, on comparing the identified merchant behavior pattern with a predicted merchant behavior pattern, and
performing, by the server system, one of:
labeling the at least one merchant as a collusive merchant based, at least in part, on the collusion score being at least equal to the predetermined collusion threshold, and
labeling the at least one merchant of the plurality of merchants as a non-collusive merchant based, at least in part, on the collusion score being less than the predetermined collusion threshold.
6. The computer-implemented method as claimed in claim 1, wherein the first ML model comprises a deviation-based Marked Temporal Point Process (MTPP) model and the second ML model comprises a Gradient Boosting Machine (GBM) model.
7. The computer-implemented method as claimed in claim 1, further comprising:
predicting, by the server system via the first ML model, an occurrence of a future payment transaction at each merchant based, at least in part, on the plurality of sequence-related embeddings corresponding to each card-specific sequence;
identifying, by the server system, an actual occurrence of the future payment transaction at each merchant based, at least in part, on historical information corresponding to payment transactions performed between the plurality of cardholders and the plurality of merchants, the historical information accessed from the database;
determining, by the server system via the first ML model, a deviation between the actual occurrence and a predicted occurrence of the future payment transaction at each merchant of the plurality of merchants based, at least in part, on comparing the actual occurrence and the prediction of the first ML model; and
performing, by the server system via the first ML model, one of:
labeling the future payment transaction as a fraudulent transaction based, at least in part, on the deviation being at least equal to a predefined fraudulent threshold, and
labeling the future payment transaction as a non-fraudulent transaction based, at least in part, on the deviation being less than the predefined fraudulent threshold.
8. The computer-implemented method as claimed in claim 1, wherein the plurality of sequence-related features comprises at least: time stamps of payment transactions, a location flag, a payment card status flag, a transaction status flag, a sequence type, and label information.
9. The computer-implemented method as claimed in claim 1, further comprising:
generating, by the server system, the first ML model, wherein generating the first ML model comprises performing a first set of operations iteratively until the performance of the first ML model converges to first predefined criteria, the first set of operations comprising:
initializing the first ML model based, at least in part, on one or more first model parameters;
predicting, via the first ML model, an occurrence of a future payment transaction based, at least in part, on the plurality of sequence-related embeddings;
computing, via the first ML model, a deviation between the predicted occurrence and an actual occurrence of the future payment transaction; and
optimizing the one or more first model parameters associated with the first ML model based, at least in part, on the deviation.
10. The computer-implemented method as claimed in claim 1, further comprising:
generating, by the server system, the second ML model, wherein generating the second ML model comprises performing a second set of operations iteratively until the performance of the second ML model converges to second predefined criteria, the second set of operations comprising:
initializing the second ML model based, at least in part, on one or more second model parameters and an initial prediction;
generating, via the second ML model, a decision tree for generating predictions based, at least in part, on the set of concatenated merchant-related embeddings;
computing, via the second ML model, one or more optimization parameters based at least on a performance of the decision tree, using one or more optimization functions;
building, via the second ML model, an updated decision tree for reducing the one or more optimization parameters;
assigning, via the second ML model, weights to the updated decision tree and the previously generated decision tree based, at least in part, on a performance of the updated decision tree and the previously generated decision tree in reducing the one or more optimization parameters; and
generating, via the second ML model, an ensemble of a plurality of decision trees by adding predictions of the updated decision tree to predictions of the previously generated decision tree and optimizing the one or more second model parameters.
11. The computer-implemented method as claimed in claim 1, wherein the server system is a payment server associated with a payment network.
12. A server system, comprising:
a communication interface;
a memory comprising executable instructions; and
a processor communicably coupled to the communication interface and the memory, the processor configured to cause the server system to at least:
access a plurality of sequence-related features from a database associated with the server system, the plurality of sequence-related features corresponding to one or more card-specific sequences of payment transactions performed by a plurality of cardholders at a plurality of merchants, each card-specific sequence of the one or more card-specific sequences corresponds to a particular cardholder from the plurality of cardholders;
generate via a first Machine Learning (ML) model, a plurality of sequence-related embeddings based, at least in part, on the plurality of sequence-related features, each sequence-related embedding of the plurality of sequence-related embeddings corresponds to each card-specific sequence;
generate a merchant-specific embedding for each merchant of the plurality of merchants based, at least in part, on aggregating a relevant set of embeddings of the plurality of sequence-related embeddings;
generate, via a second ML model, a plurality of merchant velocity embeddings based, at least in part, on a plurality of merchant velocity features corresponding to each merchant accessed from the database, each merchant velocity embedding corresponds to each payment transaction performed at the corresponding merchant;
generate a set of concatenated merchant-related embeddings for each merchant based, at least in part, on concatenating the merchant-specific embedding and the plurality of merchant velocity embeddings associated with each merchant; and
determine via the second ML model, a merchant behavior pattern corresponding to each merchant based, at least in part, on the set of concatenated merchant-related embeddings and a predetermined collusion threshold.
13. The server system as claimed in claim 12, wherein the server system is further caused at least to:
access a cardholder-related dataset from the database, the cardholder-related dataset comprising historical information corresponding to payment transactions performed by the plurality of cardholders at the plurality of merchants;
generate transaction sequence data corresponding to each cardholder based, at least in part, on the cardholder-related dataset and a time stamp associated with each payment transaction, the transaction sequence data comprising the one or more card-specific sequences of payment transactions corresponding to each cardholder, each card-specific sequence corresponds to a sequence of payment transactions linked with a time of occurrence and a predetermined label and performed by a particular cardholder at the plurality of merchants;
generate a plurality of sequence-related features corresponding to each cardholder for each card-specific sequence based, at least in part, on the transaction sequence data; and
store the plurality of sequence-related features for each cardholder and for each card-specific sequence in the database.
14. The server system as claimed in claim 12, wherein to generate the merchant-specific embedding for each merchant, the server system is further caused at least to:
determine the relevant set of embeddings for each merchant based, at least in part, on the plurality of sequence-related embeddings, the relevant set of embeddings indicating one or more sequence-related embeddings of the plurality of sequence-related embeddings corresponding to the corresponding merchant;
aggregate the relevant set of embeddings for each merchant using a predefined aggregation technique; and
generate the merchant-specific embedding for each merchant based, at least in part, on aggregating the relevant set of embeddings for each merchant.
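The aggregation of claim 14 can be illustrated with mean pooling as the predefined aggregation technique; the embeddings, the merchant-to-sequence mapping, and the choice of mean (rather than, say, max or sum pooling) are assumptions for the sketch.

```python
import numpy as np

# Hypothetical sequence-related embeddings keyed by card-specific sequence,
# and the merchants whose transactions appear in each sequence.
sequence_embeddings = {
    "seq_a": np.array([1.0, 0.0, 2.0]),
    "seq_b": np.array([3.0, 4.0, 0.0]),
    "seq_c": np.array([0.0, 2.0, 2.0]),
}
merchant_to_sequences = {"m_1": ["seq_a", "seq_b"], "m_2": ["seq_c"]}

def merchant_specific_embedding(merchant_id, aggregation=np.mean):
    # Determine the relevant set of embeddings for this merchant, then
    # aggregate them with the predefined aggregation technique.
    relevant = [sequence_embeddings[s] for s in merchant_to_sequences[merchant_id]]
    return aggregation(np.stack(relevant), axis=0)

emb_m1 = merchant_specific_embedding("m_1")
```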
15. The server system as claimed in claim 12, wherein the server system is further caused at least to:
access a merchant-related dataset from the database, the merchant-related dataset comprising historical information corresponding to payment transactions performed at each merchant of the plurality of merchants;
generate a plurality of merchant velocity features corresponding to each merchant of the plurality of merchants based, at least in part, on the merchant-related dataset; and
store the plurality of merchant velocity features for each merchant in the database.
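One plausible reading of a merchant velocity feature, as recited in claim 15, is a per-merchant transaction count within a fixed time window; the window length and the dataset below are illustrative assumptions.

```python
from collections import Counter

WINDOW = 3600  # seconds; hypothetical velocity window

# Hypothetical merchant-related dataset: (merchant_id, timestamp) pairs.
history = [("m_1", 1000), ("m_1", 1500), ("m_1", 5000), ("m_2", 1200)]

# Velocity feature: number of payment transactions per merchant per window.
velocity = Counter((m, ts // WINDOW) for m, ts in history)
merchant_velocity_features = {
    m: [count for (mid, _), count in sorted(velocity.items()) if mid == m]
    for m in {m for m, _ in history}
}
```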
16. The server system as claimed in claim 12, wherein for the server system to determine, via the second ML model, the merchant behavior pattern, the server system is further caused at least to:
identify the merchant behavior pattern corresponding to each merchant of the plurality of merchants based, at least in part, on the set of concatenated merchant-related embeddings;
generate and assign a collusion score to at least one merchant of the plurality of merchants based, at least in part, on comparing the identified merchant behavior pattern with a predicted merchant behavior pattern; and
perform one of:
label the at least one merchant of the plurality of merchants as a collusive merchant based, at least in part, on the collusion score being at least equal to the predetermined collusion threshold, and
label the at least one merchant of the plurality of merchants as a non-collusive merchant based, at least in part, on the collusion score being less than the predetermined collusion threshold.
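The scoring and labeling of claim 16 can be sketched as below. The use of a Euclidean distance between the identified and predicted behavior patterns as the collusion score, and the threshold value, are hypothetical stand-ins.

```python
import numpy as np

def collusion_score(identified_pattern, predicted_pattern):
    # One plausible score: the L2 distance between the identified merchant
    # behavior pattern and the pattern predicted for that merchant.
    return float(np.linalg.norm(np.asarray(identified_pattern) - np.asarray(predicted_pattern)))

def label_merchant(score, threshold):
    # Collusive when the score is at least equal to the predetermined
    # collusion threshold; non-collusive otherwise.
    return "collusive" if score >= threshold else "non-collusive"

score = collusion_score([1.0, 2.0], [1.0, 0.0])
label = label_merchant(score, threshold=1.5)
```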
17. The server system as claimed in claim 12, wherein the server system is further caused at least to:
predict, via the first ML model, an occurrence of a future payment transaction at each merchant of the plurality of merchants based, at least in part, on the plurality of sequence-related embeddings corresponding to each card-specific sequence;
identify an actual occurrence of the future payment transaction at each merchant of the plurality of merchants based, at least in part, on historical information corresponding to payment transactions performed between the plurality of cardholders and the plurality of merchants, the historical information accessed from the database;
determine, via the first ML model, a deviation between the actual occurrence and a predicted occurrence of the future payment transaction at each merchant of the plurality of merchants based, at least in part, on comparing the actual occurrence and the prediction of the first ML model; and
perform, via the first ML model, one of:
label the future payment transaction as a fraudulent transaction based, at least in part, on the deviation being at least equal to a predefined fraudulent threshold, and
label the future payment transaction as a non-fraudulent transaction based, at least in part, on the deviation being less than the predefined fraudulent threshold.
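The deviation-based labeling of claim 17 can be sketched as follows, assuming the first ML model emits a probability of occurrence and the actual occurrence is encoded as 0 or 1; the threshold value is illustrative.

```python
def label_transaction(predicted_prob, actual, fraud_threshold=0.5):
    # Deviation between the actual occurrence (0 or 1) and the model's
    # predicted probability of the future payment transaction.
    deviation = abs(actual - predicted_prob)
    # Fraudulent when the deviation is at least equal to the threshold.
    return "fraudulent" if deviation >= fraud_threshold else "non-fraudulent"

example = label_transaction(predicted_prob=0.1, actual=1)  # deviation 0.9
```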
18. The server system as claimed in claim 12, wherein the server system is further caused at least to generate the first ML model, wherein to generate the first ML model, the server system is further caused at least to perform a first set of operations iteratively until the performance of the first ML model converges to first predefined criteria, the first set of operations comprising:
initializing the first ML model based, at least in part, on one or more first model parameters;
predicting, via the first ML model, an occurrence of the future payment transaction based, at least in part, on the plurality of sequence-related embeddings;
computing, via the first ML model, a deviation between the predicted occurrence and an actual occurrence of the future payment transaction; and
optimizing the one or more first model parameters associated with the first ML model based, at least in part, on the deviation.
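The iterative first set of operations in claim 18 follows the shape of a standard gradient-descent training loop. A minimal sketch with logistic regression standing in for the first ML model, synthetic data standing in for the sequence-related embeddings, and a loss-change tolerance standing in for the first predefined criteria:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sequence-related embeddings (X) and actual occurrences (y).
X = rng.normal(size=(64, 4))
y = (X[:, 0] > 0).astype(float)

# Initialize the model with its first model parameters.
w = np.zeros(4)

def predict(w, X):
    return 1.0 / (1.0 + np.exp(-X @ w))  # predicted occurrence probability

# Iterate until the deviation (mean log-loss) converges to the criterion.
prev_loss = np.inf
for _ in range(500):
    p = predict(w, X)
    loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
    if abs(prev_loss - loss) < 1e-6:        # first predefined criteria
        break
    prev_loss = loss
    w -= 0.5 * (X.T @ (p - y)) / len(y)     # optimize the first model parameters
```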
19. The server system as claimed in claim 12, wherein the server system is further caused at least to generate the second ML model, wherein to generate the second ML model, the server system is further caused at least to perform a second set of operations iteratively until the performance of the second ML model converges to second predefined criteria, the second set of operations comprising:
initializing the second ML model based, at least in part, on one or more second model parameters and an initial prediction;
generating, via the second ML model, a decision tree for generating predictions based, at least in part, on the set of concatenated merchant-related embeddings;
computing, via the second ML model, one or more optimization parameters based, at least in part, on a performance of the decision tree, using one or more optimization functions;
building, via the second ML model, an updated decision tree for reducing the one or more optimization parameters;
assigning, via the second ML model, weights to the updated decision tree and the previously generated decision tree based, at least in part, on a performance of the updated decision tree and the previously generated decision tree in reducing the one or more optimization parameters; and
generating, via the second ML model, an ensemble of a plurality of decision trees by adding predictions of the updated decision tree to predictions of the previously generated decision tree and optimizing the one or more second model parameters.
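The second set of operations in claim 19 reads as gradient-boosted decision trees: each updated tree reduces the residual left by the ensemble so far, and its predictions are added with a weight. A minimal sketch, with depth-1 stumps standing in for decision trees, randomly chosen split points rather than optimal splits, and a fixed learning rate standing in for the per-tree weights:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical concatenated merchant-related embeddings (X) and a target (y).
X = rng.normal(size=(100, 3))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=100)

def fit_stump(X, residual, rng):
    # A depth-1 stand-in for a decision tree: split feature 0 at a
    # randomly chosen training value (illustrative, not an optimal split).
    t = float(rng.choice(X[:, 0]))
    mask = X[:, 0] <= t
    lv = float(residual[mask].mean()) if mask.any() else 0.0
    rv = float(residual[~mask].mean()) if (~mask).any() else 0.0
    return t, lv, rv

def stump_predict(stump, X):
    t, lv, rv = stump
    return np.where(X[:, 0] <= t, lv, rv)

pred = np.full(len(y), y.mean())   # initial prediction
trees, lr = [], 0.5                # lr stands in for the per-tree weights
for _ in range(50):
    residual = y - pred            # the quantity each updated tree reduces
    stump = fit_stump(X, residual, rng)
    trees.append(stump)
    pred = pred + lr * stump_predict(stump, X)  # add predictions to the ensemble

mse = float(np.mean((y - pred) ** 2))
```

The loop mirrors the claimed iteration: the residual plays the role of the optimization parameters, each stump is an "updated decision tree" built to reduce them, and the final predictor is the weighted ensemble.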
20. A non-transitory computer-readable storage medium comprising computer-executable instructions that, when executed by at least a processor of a server system, cause the server system to perform a method comprising:
accessing a plurality of sequence-related features from a database associated with the server system, the plurality of sequence-related features corresponding to one or more card-specific sequences of payment transactions performed by a plurality of cardholders at a plurality of merchants, each card-specific sequence of the one or more card-specific sequences corresponds to a particular cardholder from the plurality of cardholders;
generating, via a first Machine Learning (ML) model, a plurality of sequence-related embeddings based, at least in part, on the plurality of sequence-related features, each sequence-related embedding of the plurality of sequence-related embeddings corresponds to each card-specific sequence;
generating a merchant-specific embedding for each merchant of the plurality of merchants based, at least in part, on aggregating a relevant set of embeddings of the plurality of sequence-related embeddings;
generating, via a second ML model, a plurality of merchant velocity embeddings based, at least in part, on a plurality of merchant velocity features corresponding to each merchant accessed from the database, each merchant velocity embedding corresponds to each payment transaction performed at the corresponding merchant;
generating a set of concatenated merchant-related embeddings for each merchant based, at least in part, on concatenating the merchant-specific embedding and the plurality of merchant velocity embeddings associated with each merchant; and
determining, via the second ML model, a merchant behavior pattern corresponding to each merchant based, at least in part, on the set of concatenated merchant-related embeddings and a predetermined collusion threshold.

Documents

Application Documents

# Name Date
1 202441007067-STATEMENT OF UNDERTAKING (FORM 3) [02-02-2024(online)].pdf 2024-02-02
2 202441007067-POWER OF AUTHORITY [02-02-2024(online)].pdf 2024-02-02
3 202441007067-FORM 1 [02-02-2024(online)].pdf 2024-02-02
4 202441007067-FIGURE OF ABSTRACT [02-02-2024(online)].pdf 2024-02-02
5 202441007067-DRAWINGS [02-02-2024(online)].pdf 2024-02-02
6 202441007067-DECLARATION OF INVENTORSHIP (FORM 5) [02-02-2024(online)].pdf 2024-02-02
7 202441007067-COMPLETE SPECIFICATION [02-02-2024(online)].pdf 2024-02-02
8 202441007067-Proof of Right [08-04-2024(online)].pdf 2024-04-08