Abstract: Embodiments provide methods and systems for large-scale neural network (NN) training. The method performed by a server system includes accessing a training data set associated with each entity of a plurality of entities from one or more data sources. The method includes initializing global NN parameters of a global NN model. The method includes instancing local neural network parameters of each entity-specific NN model based on the global neural network parameters. The method includes training a plurality of entity-specific NN models based on the corresponding training data set associated with each of the plurality of entities. The method includes calculating a consolidated loss value based on a combination of validation loss values calculated for the plurality of entity-specific NN models and attention weights associated with the plurality of entity-specific NN models. The method includes updating the global neural network parameters of the global NN model based on the consolidated loss value.
CLAIMS
We claim:
1. A computer-implemented method, comprising:
accessing, by a server system, a training data set associated with each entity of a plurality of entities from one or more data sources;
initializing, by the server system, global neural network parameters of a global neural network model;
instancing, by the server system, local neural network parameters of a plurality of entity-specific neural network models based, at least in part, on the global neural network parameters, each entity-specific neural network model associated with each entity of the plurality of entities;
training, by the server system, the plurality of entity-specific neural network models based, at least in part, on the corresponding training data set associated with each entity of the plurality of entities;
accessing, by the server system, a validation data set associated with each entity from the one or more data sources;
calculating, by the server system, a consolidated loss value based, at least in part, on a combination of validation loss values calculated for the plurality of entity-specific neural network models and attention weights of the plurality of entity-specific neural network models; and
updating, by the server system, the global neural network parameters of the global neural network model based, at least in part, on the consolidated loss value, wherein the global neural network model is trained based on joint learnings of the plurality of entity-specific neural network models with facilitation of transfer learning methods.
2. The computer-implemented method as claimed in claim 1, further comprising:
fine-tuning, by the server system, the local neural network parameters of the plurality of entity-specific neural network models based, at least in part, on updated global neural network parameters of the global neural network model and the attention weights of the plurality of entity-specific neural network models.
3. The computer-implemented method as claimed in claim 1, wherein the attention weights of the plurality of entity-specific neural network models are adapted based, at least in part, on output values of the plurality of entity-specific neural network models during validation phase.
4. The computer-implemented method as claimed in claim 1, further comprising:
scaling, by the server system, the training data set of each entity based, at least in part, on an autoencoder.
5. The computer-implemented method as claimed in claim 1, further comprising:
calculating, by the server system, the validation loss values for the plurality of entity-specific neural network models based, at least in part, on a loss function.
6. The computer-implemented method as claimed in claim 1, further comprising:
identifying, by the server system, a set of entities with a similar change in the validation loss values of the plurality of entity-specific neural network models to aggregate learnings of entity-specific neural network models of the set of entities, thereby reducing computational complexity and space requirements, wherein the set of entities with the similar change in the validation loss values is identified based, at least in part, on a cosine similarity measure.
7. The computer-implemented method as claimed in claim 1, further comprising:
initializing, by the server system, an entity-specific neural network model associated with a new entity based, at least in part, on the global neural network parameters of the global neural network model, the global neural network model previously trained based on consolidated learnings of the plurality of entity-specific neural network models; and
training, by the server system, the entity-specific neural network model based, at least in part, on training data of the new entity.
8. The computer-implemented method as claimed in claim 7, wherein the plurality of entities represents financial entities including one or more acquirers or one or more issuers.
9. The computer-implemented method as claimed in claim 7, wherein an entity-specific neural network model associated with a particular financial entity is trained based on generalized learnings of the financial entities, and wherein the entity-specific neural network model is configured to predict network scores of cardholders associated with the particular financial entity.
10. A server system configured to perform the computer-implemented method as claimed in any one of claims 1 to 9.
11. A computer-implemented method, comprising:
accessing, by a server system, a training data set associated with each financial entity of a plurality of financial entities from one or more financial data sources, each of the plurality of financial entities being at least one of: an acquirer and an issuer;
initializing, by the server system, global neural network parameters of a global neural network model;
instancing, by the server system, local neural network parameters of a plurality of entity-specific neural network models based, at least in part, on the global neural network parameters, each entity-specific neural network model associated with each financial entity of the plurality of financial entities;
training, by the server system, the plurality of entity-specific neural network models based, at least in part, on the corresponding training data set associated with each financial entity of the plurality of financial entities;
accessing, by the server system, a validation data set associated with each financial entity from the one or more financial data sources;
calculating, by the server system, a consolidated loss value based, at least in part, on a combination of validation loss values calculated for the plurality of entity-specific neural network models and attention weights of the plurality of entity-specific neural network models; and
updating, by the server system, the global neural network parameters of the global neural network model based, at least in part, on the consolidated loss value, wherein the global neural network model is trained based on joint learnings of the plurality of entity-specific neural network models with facilitation of transfer learning methods, and
wherein an entity-specific neural network model associated with a particular financial entity is configured to predict network scores of cardholders associated with the particular financial entity while generalizing learnings of the entity-specific neural network model from the plurality of financial entities.
FORM 2
THE PATENTS ACT 1970
(39 of 1970)
&
The Patent Rules 2003
COMPLETE SPECIFICATION
(refer section 10 & rule 13)
TITLE OF THE INVENTION:
METHODS AND SYSTEMS FOR TRAINING OF LARGE-SCALE NEURAL NETWORK MODELS OF MULTIPLE ENTITIES
APPLICANT(S):
Name: MASTERCARD INTERNATIONAL INCORPORATED
Nationality: United States of America
Address: 2000 Purchase Street, Purchase, NY 10577, United States of America
PREAMBLE TO THE DESCRIPTION
The following specification particularly describes the invention and the manner in which it is to be performed.
DESCRIPTION
(See next page)
METHODS AND SYSTEMS FOR TRAINING OF LARGE-SCALE NEURAL NETWORK MODELS OF MULTIPLE ENTITIES
TECHNICAL FIELD
The present disclosure relates to artificial intelligence processing systems and, more particularly to, electronic methods and complex processing systems for training large-scale neural network models of multiple entities.
BACKGROUND
Over the last few years, there has been a rapid increase in the usage of computer algorithms based on technologies such as machine learning (ML), artificial intelligence (AI), neural networks (NN), Internet of Things (IoT), and so on. Building any computer application based on the above-mentioned algorithms requires a large amount of data. For example, during training of a neural network (NN) model, the NN model needs to be trained with a large amount of training data. Additionally, the NN model may be trained with training data coming from various diverse sources (e.g., training data may be fetched from different geographies, different issuers, various merchants, etc.). However, training the NN model with such a large amount of training data may become a bottleneck because the NN model may not be properly trained and deployed at that scale. Training the NN model with training data extracted from various diverse sources poses a further challenge. Furthermore, re-training of the NN models is difficult due to the large amount of training data and the variance in the data sources from which the training data is extracted.
In view of the above discussion, there exists a technological need for a method of training large-scale neural networks.
SUMMARY
Various embodiments of the present disclosure provide methods and systems for training large-scale neural network models.
In an embodiment, a computer-implemented method is disclosed. The computer-implemented method performed by a server system includes accessing, by a server system, a training data set associated with each entity of a plurality of entities from one or more data sources. The computer-implemented method includes initializing, by the server system, global neural network parameters of a global neural network model. The computer-implemented method includes instancing, by the server system, local neural network parameters of each entity-specific neural network (NN) model of a plurality of entity-specific neural network (NN) models based, at least in part, on the global neural network parameters. The computer-implemented method includes training, by the server system, the plurality of entity-specific neural network models based, at least in part, on the corresponding training data set associated with each entity of the plurality of entities. The computer-implemented method includes accessing, by the server system, a validation data set associated with each entity from the one or more data sources. The computer-implemented method includes calculating, by the server system, a consolidated loss value based, at least in part, on a combination of validation loss values calculated for the plurality of entity-specific neural network models and attention weights associated with the plurality of entity-specific neural network models. The computer-implemented method further includes updating, by the server system, the global neural network parameters of the global neural network model based, at least in part, on the consolidated loss value. The global neural network model is trained based on joint learnings of the plurality of entity-specific neural network models with the facilitation of transfer learning methods.
Other aspects and example embodiments are provided in the drawings and the detailed description that follows.
BRIEF DESCRIPTION OF THE FIGURES
For a more complete understanding of example embodiments of the present technology, reference is now made to the following descriptions taken in connection with the accompanying drawings in which:
FIG. 1A illustrates an exemplary representation of an environment related to at least some example embodiments of the present disclosure;
FIG. 1B illustrates another exemplary representation of an environment related to at least some example embodiments of the present disclosure;
FIG. 2 illustrates a simplified block diagram of a server system, in accordance with an embodiment of the present disclosure;
FIG. 3A represents an architecture of a global neural network model instanced based on a plurality of entity-specific NN models, in accordance with an embodiment of the present disclosure;
FIG. 3B represents an architecture of a global neural network model instanced based on a plurality of entity-specific NN models after addition of a new entity-specific NN model, in accordance with an embodiment of the present disclosure;
FIG. 4 is a schematic representation of the global NN model and entity-specific NN models of the plurality of entities, in accordance with an embodiment of the present disclosure;
FIG. 5 represents a flow chart of a training phase for a large-scale neural network, in accordance with an embodiment of the present disclosure;
FIG. 6 represents a flow chart of a process flow of training an entity-specific NN model for a new entity, in accordance with an embodiment of the present disclosure;
FIG. 7 represents a flow chart of a process flow of joint training of a plurality of financial entities, in accordance with an embodiment of the present disclosure; and
FIG. 8 illustrates a flow diagram depicting a method for large-scale neural network training for multiple entities, in accordance with an embodiment of the present disclosure.
The drawings referred to in this description are not to be understood as being drawn to scale except if specifically noted, and such drawings are only exemplary in nature.
DETAILED DESCRIPTION
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. It will be apparent, however, to one skilled in the art that the present disclosure can be practiced without these specific details.
Reference in this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Appearances of the phrase “in an embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Moreover, various features are described which may be exhibited by some embodiments and not by others. Similarly, various requirements are described which may be requirements for some embodiments but not for other embodiments.
Moreover, although the following description contains many specifics for the purposes of illustration, anyone skilled in the art will appreciate that many variations and/or alterations to said details are within the scope of the present disclosure. Similarly, although many of the features of the present disclosure are described in terms of each other, or in conjunction with each other, one skilled in the art will appreciate that many of these features can be provided independently of other features. Accordingly, this description of the present disclosure is set forth without any loss of generality to, and without imposing limitations upon, the present disclosure.
The term "payment network", used herein, refers to a network or collection of systems used for the transfer of funds through the use of cash substitutes. Payment networks may use a variety of different protocols and procedures to process the transfer of money for various types of transactions. Transactions that may be performed via a payment network may include product or service purchases, credit purchases, debit transactions, fund transfers, account withdrawals, etc. Payment networks may be configured to perform transactions via cash-substitutes, which may include payment cards, letters of credit, checks, financial accounts, etc. Examples of networks or systems configured to perform as payment networks include those operated by Mastercard®.
The term "data sources", used throughout the description, refers to devices, databases, cloud storages, or server systems that are capable of generating/sending data associated with various components incorporated in them. The data sources may transmit data to a server system or any external device that can be used to train various models and further detect an anomaly, predict the next occurrence, etc., based on the training.
OVERVIEW
Various embodiments of the present disclosure provide methods, systems, electronic devices, and computer program products for training large-scale neural network models. More specifically, embodiments of the present disclosure disclose a method for training a global neural network (NN) model and a plurality of entity-specific neural network (NN) models. In one embodiment, the global neural network model and each of the plurality of entity-specific neural network models correspond to a dynamic recurrent neural network (DRNN).
In general, a neural network is a network of artificial neurons or nodes that mimic the way a human brain operates to recognize patterns and solve problems in the field of artificial intelligence, machine learning, deep learning, and the like. In addition, a recurrent neural network (RNN) is a type of artificial neural network that can process dynamic temporal sequence data by providing an output from the previous set of neurons as an input to the current set of neurons of the neural network. In general, a dynamic recurrent neural network (DRNN) model is a complex neural network that analyzes time-dependent past information to predict current state and future state information.
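In one illustrative, non-limiting example, the recurrence underlying an RNN as described above may be sketched in Python. The scalar cell, weight values, and input sequence below are hypothetical simplifications (a DRNN in practice uses vector states and learned weights); the sketch only shows how each step's output feeds back as state for the next step:

```python
import math

# Minimal scalar RNN cell: the hidden state at step t depends on the
# current input and the previous hidden state (illustrative weights).
def rnn_step(x_t, h_prev, w_x=0.5, w_h=0.3, b=0.1):
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def rnn_forward(sequence, h0=0.0):
    h = h0
    states = []
    for x_t in sequence:          # previous output feeds the next step
        h = rnn_step(x_t, h)
        states.append(h)
    return states

states = rnn_forward([1.0, 0.5, -0.2])
```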
As noted above, any NN model needs to be trained on training data. More specifically, during training of the NN model, the neural network is trained on a training data set and further validated on a validation data set. The training data may include data associated with entities from various sources. There may exist no relationship between any two entities belonging to the same training data set. For example, the training data set may include data belonging to different issuers, different acquirers, different geographies, different merchants, and the like. In addition, re-training of the NN model becomes challenging due to at least: (i) large-scale data, and (ii) data coming from various sources.
To overcome such problems or limitations, the present disclosure describes a server system that is configured to train the global NN model and the plurality of entity-specific NN models corresponding to the plurality of entities. In one embodiment, the plurality of entities may include financial entities such as acquirers, issuers, and the like.
At least one of the technical problems addressed by the present disclosure includes: (i) large-scale efficient training of neural networks, and (ii) computational complexity during training and re-training of neural networks.
The server system includes at least a processor and a memory. In one non-limiting example, the server system is a payment server. The server system is configured to access the training data set associated with each entity of a plurality of entities from one or more data sources. In one example, the training data set is associated with each of the plurality of entities.
The server system is configured to generate a plurality of data features corresponding to each entity, based, at least in part, on the training data set.
The server system is configured to generate an input scalar value for the plurality of data features based, at least in part, on the set of data samples corresponding to each entity. The server system transmits the input scalar value as an input to a neural network (NN) model. The NN model corresponds to an autoencoder architecture. The server system is configured to perform processing on the input scalar value with the facilitation of the NN model for scaling the plurality of data features for each entity of the plurality of entities. The server system, with the execution of the autoencoder, reconstructs the plurality of data features for generating standardized and scaled data instances for each entity of the plurality of entities.
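In one illustrative, non-limiting example, the autoencoder-based scaling described above may be sketched as follows. The sketch uses a toy one-dimensional linear autoencoder with hypothetical weights and learning rate (the actual model architecture is not limited to this form): the encoder maps a feature value to a latent value, the decoder reconstructs it, and the squared reconstruction error drives the weight updates.

```python
# Toy linear autoencoder for one entity's scalar feature values
# (illustrative only; weights, learning rate, and epoch count are
# hypothetical choices, not part of the specification).
def train_autoencoder(data, lr=0.01, epochs=1000):
    w_enc, w_dec = 0.5, 0.5          # encoder and decoder weights
    for _ in range(epochs):
        for x in data:
            z = w_enc * x            # encode to a latent scalar
            x_hat = w_dec * z        # decode / reconstruct
            err = x_hat - x          # reconstruction error
            # gradient steps on the squared reconstruction error
            w_dec -= lr * err * z
            w_enc -= lr * err * w_dec * x
    return w_enc, w_dec

def reconstruct(x, w_enc, w_dec):
    return w_dec * (w_enc * x)

data = [0.2, 0.4, 0.6, 0.8]          # hypothetical scaled feature values
w_enc, w_dec = train_autoencoder(data)
```

After training, reconstructed values closely track the inputs, yielding standardized, scaled data instances for the entity-specific model.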
The server system is configured to initialize the global neural network parameters of the global NN model. In addition, the server system is configured to instance local neural network parameters of each entity-specific neural network model of a plurality of entity-specific neural network models based, at least in part, on the global neural network parameters. The server system is further configured to train the plurality of entity-specific neural network models based, at least in part, on the corresponding training data set associated with each entity of the plurality of entities. In one embodiment, each entity-specific neural network model is trained based on the training data set corresponding to the entity. In one example, a first entity-specific neural network model is trained for a first entity, a second entity-specific neural network model is trained for a second entity, a third entity-specific neural network model is trained for a third entity, and so on. In an embodiment, a number of entity-specific NN models depends on the number of entities.
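The instancing step may be illustrated, in one non-limiting Python sketch (the parameter names and entity names are hypothetical), as creating an independent copy of the global parameters for each entity, so that subsequent per-entity training does not mutate the shared global state:

```python
import copy

# Hypothetical global parameter dictionary (names are illustrative).
global_params = {"w_hidden": [0.1, -0.2, 0.3], "bias": 0.05}

def instance_local_model(global_params):
    # Each entity-specific model starts as an independent deep copy of
    # the global parameters; later per-entity updates stay local.
    return copy.deepcopy(global_params)

entities = ["entity_a", "entity_b", "entity_c"]
local_models = {e: instance_local_model(global_params) for e in entities}

# Stand-in for a per-entity training update: only entity_a changes.
local_models["entity_a"]["bias"] = 0.9
```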
Furthermore, the server system is configured to access a validation data set associated with each entity from the one or more data sources. Moreover, the server system is configured to calculate a consolidated loss value based, at least in part, on a combination of validation loss values calculated for the plurality of entity-specific NN models and attention weights associated with the plurality of entity-specific NN models. The server system is also configured to update the global neural network parameters (e.g., consolidated weight) of the global NN model based, at least in part, on the consolidated loss value. The global NN model is trained based on learnings of the plurality of entity-specific NN models with the facilitation of transfer learning methods. In one embodiment, the global neural network parameters may include a consolidated weight value associated with the global NN model.
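The consolidation step described above may be sketched, in one non-limiting example, as an attention-weighted combination of the per-entity validation losses. The use of a softmax normalization for the attention scores, and the particular values, are hypothetical illustrations rather than requirements of the disclosure:

```python
import math

def softmax(values):
    # Numerically stable softmax over raw attention scores.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    s = sum(exps)
    return [e / s for e in exps]

def consolidated_loss(validation_losses, attention_scores):
    # Combine per-entity validation losses using normalized attention
    # weights; the resulting scalar drives the global parameter update.
    weights = softmax(attention_scores)
    return sum(w * l for w, l in zip(weights, validation_losses))

losses = [0.8, 0.5, 1.2]   # hypothetical per-entity validation losses
scores = [1.0, 2.0, 0.5]   # hypothetical attention scores
loss = consolidated_loss(losses, scores)
```

Because the weights are normalized, the consolidated loss always lies between the smallest and largest per-entity validation loss.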
The server system is configured to calculate validation loss values for each entity-specific NN model associated with the corresponding entity of the plurality of entities. In addition, the server system is configured to aggregate similar entity-specific NN models together based, at least in part, on the calculated validation loss values for reducing computational complexity. The server system aggregates the similar entity-specific NN models together for identifying similar entities of the plurality of entities showing similar data patterns. In one embodiment, the server system is configured to identify a set of entities with a similar change in validation loss values of the entity-specific NN models to aggregate learnings of the entity-specific NN models, thereby reducing computational complexity and space requirements.
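In one illustrative, non-limiting sketch of this aggregation, entities whose validation-loss changes point in a similar direction (as measured by cosine similarity, per claim 6) are grouped together. The greedy grouping strategy and the threshold value below are assumptions made for illustration:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def group_similar_entities(loss_deltas, threshold=0.95):
    # Greedily place entities whose validation-loss trajectories are
    # similar into one group, so their learnings can be aggregated once
    # instead of per entity (reducing compute and space).
    groups = []
    for name, vec in loss_deltas.items():
        for group in groups:
            rep = loss_deltas[group[0]]     # first member represents group
            if cosine_similarity(vec, rep) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups

deltas = {
    "issuer_a": [0.10, 0.08, 0.05],
    "issuer_b": [0.20, 0.16, 0.10],   # same direction as issuer_a
    "issuer_c": [-0.10, 0.30, 0.02],  # different trajectory
}
groups = group_similar_entities(deltas)
```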
After training the global neural network model, when a new entity is to be added to the plurality of entities, the global neural network model need not be trained from scratch. The server system is configured to fine-tune a validation loss value of a new entity-specific NN model based, at least in part, on the global neural network parameters of the global neural network model. In addition, the new entity-specific NN model is created as an instance of the global neural network model for the corresponding new entity. In one embodiment, the attention weights of the entity-specific NN models are adapted based, at least in part, on output values of the entity-specific NN models during the validation phase.
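One non-limiting way to picture this warm-start behavior is the following sketch, in which a new entity's model is initialized from a trained global weight (rather than from random initialization) and fine-tuned with a few gradient steps on the new entity's own data. The scalar linear model, learning rate, and data are hypothetical simplifications:

```python
# Hypothetical sketch: a new entity's model starts from the global
# weight learned across existing entities and is fine-tuned on the new
# entity's own data, avoiding training from scratch.
def fine_tune(w_global, data, lr=0.1, steps=50):
    w = w_global                      # initialize from the global model
    for _ in range(steps):
        for x, y in data:
            pred = w * x
            w -= lr * (pred - y) * x  # squared-error gradient step
    return w

new_entity_data = [(1.0, 2.1), (2.0, 3.9), (0.5, 1.05)]  # y roughly 2x
w_global = 1.5                        # weight from prior joint training
w_new = fine_tune(w_global, new_entity_data)
```

A few fine-tuning passes move the weight from the global starting point to a value fitting the new entity's data.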
The server system is configured to train the global neural network model to provide a network score as an output. In one embodiment, the network score may include a card lifetime value score, decision intelligence score, and the like.
Various embodiments of the present disclosure offer multiple advantages and technical effects. For instance, the present disclosure performs large-scale training of various neural networks in parallel. The present disclosure describes the training of neural networks having high variance in input data (e.g., data coming from different categories from various sources). The present disclosure enables the NN models of entities with smaller data sets to perform better since learnings are shared across the entities. Further, the present disclosure implements one-shot training across the entities. The present disclosure provides better performance and reduced computational complexity. Furthermore, the present disclosure provides an improvement in precision in a range of around 2% to 15% as compared to dynamic recurrent neural network (DRNN) models.
Various example embodiments of the present disclosure are described hereinafter with reference to FIGS. 1A to 8.
FIG. 1A illustrates an exemplary representation of an environment 100 related to at least some example embodiments of the present disclosure. Although the environment 100 is presented in one arrangement, other embodiments may include the parts of the environment 100 (or other parts) arranged otherwise depending on, for example, training of large-scale neural network models, etc. The environment 100 generally includes a server system 102, a model database 104, a plurality of entities 106a, 106b, …, 106n, and one or more data sources 108, each coupled to, and in communication with (and/or with access to) a network 110. The network 110 may include, without limitation, a light fidelity (Li-Fi) network, a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), a satellite network, the Internet, a fiber-optic network, a coaxial cable network, an infrared (IR) network, a radio frequency (RF) network, a virtual network, and/or another suitable public and/or private network capable of supporting communication among the entities illustrated in FIG. 1A, or any combination thereof.
Various entities in the environment 100 may connect to the network 110 in accordance with various wired and wireless communication protocols, such as Transmission Control Protocol and Internet Protocol (TCP/IP), User Datagram Protocol (UDP), 2nd Generation (2G), 3rd Generation (3G), 4th Generation (4G), 5th Generation (5G) communication protocols, Long Term Evolution (LTE) communication protocols, or any combination thereof. For example, the network 110 may include multiple different networks, such as a private network made accessible by the network 110 to the server system 102, and a public network (e.g., the Internet).
Examples of the plurality of entities 106a-106n may include, but are not limited to, medical facilities (e.g., hospitals, laboratories, etc.), financial institutions, educational institutions, government agencies, and telecom industries. Each entity is associated with a local neural network model that is trained based on corresponding entity-specific training datasets.
The server system 102 is configured to perform one or more of the operations described herein. In general, the server system 102 is configured to perform training of large-scale neural network models. The server system 102 is a separate part of the environment 100 and may operate apart from (but still in communication with, for example, via the network 110) the plurality of entities 106a-106n and any third-party external servers (to access data to perform the various operations described herein). However, in other embodiments, the server system 102 may be incorporated, in whole or in part, into one or more parts of the environment 100, for example, an entity 106a. In addition, the server system 102 should be understood to be embodied in at least one computing device in communication with the network 110, which may be specifically configured, via executable instructions, to perform as described herein, and/or embodied in at least one non-transitory computer-readable medium.
In one embodiment, the server system 102 is configured to receive datasets from one or more data sources 108 associated with the plurality of entities 106a-106n including, for example, data repositories. The server system 102 is configured to utilize an autoencoder for scaling datasets of multiple entities in a network. The autoencoder learns data patterns of each entity and provides entity-specific scaled data instances corresponding to each entity-specific neural network model.
In general, the neural network model may refer to a model with problem-solving abilities, composed of artificial neurons (nodes) forming a network by a connection of synapses. The neural network model may be defined by a connection network between neurons on different layers, a learning process for updating model parameters, and an activation function for generating an output value. The neural network model may include an input layer, an output layer, and may selectively include one or more hidden layers. Each layer includes one or more neurons, and the neural network model may include synapses that connect the neurons to one another. In a neural network, each neuron may output a function value of an activation function with respect to the input signals inputted through a synapse, weight, and bias. A neural network parameter refers to a parameter determined through learning and may include the weight of synapse connection, the bias of a neuron, and the like. Moreover, hyper-parameters refer to parameters that are set before learning in a machine learning algorithm and include a learning rate, a number of iterations, a mini-batch size, an initialization function, and the like. The objective of training the neural network model is to determine a neural network parameter for significantly reducing a loss function. The loss function may be used as an indicator for determining an optimal model parameter in the learning process of the neural network model.
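The forward computation of a single artificial neuron described above (a weighted sum of inputs through a synapse, plus a bias, passed through an activation function) may be illustrated by the following non-limiting sketch; the sigmoid activation and the numeric values are hypothetical choices:

```python
import math

def sigmoid(z):
    # A common activation function mapping any real value into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    # Weighted sum of input signals plus bias, then activation.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

out = neuron(inputs=[0.5, -1.0], weights=[0.8, 0.2], bias=0.1)
```

Here the weights and bias are the neural network parameters determined through learning, while choices such as the learning rate or mini-batch size are hyper-parameters fixed before learning.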
In one embodiment, the server system 102 is configured to train each entity-specific NN model based on the entity-specific scaled data instances. Thereafter, the server system 102 is configured to provide validation datasets to the entity-specific NN models and a consolidated loss value based on a combination of validation loss values of each entity-specific NN model is calculated. Based on the consolidated loss value, the server system 102 is configured to update the neural network weights of a global neural network model. The global neural network model is configured to train each entity-specific NN model based on transfer learning methods. Hence, the server system 102 is configured to train an entity-specific NN model of an entity based on its associated training dataset while generalizing learnings from other entity-specific NN models.
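The overall round described above (instance local models from the global parameters, train them on entity data, and update the global model from their combined results) can be tied together in one non-limiting sketch. The scalar model, the single-step local training, and the attention-weighted averaging rule for the global update are illustrative assumptions, not the required implementation:

```python
# One hypothetical round of the joint-training loop: local models are
# instanced from the global weight, trained on their own data, and the
# global weight is updated via an attention-weighted combination.
def local_step(w, data, lr=0.1):
    for x, y in data:
        w -= lr * (w * x - y) * x     # squared-error gradient step
    return w

def training_round(global_w, entity_data, attention):
    # Instance each local model from the global weight, then train it.
    locals_ = {e: local_step(global_w, d) for e, d in entity_data.items()}
    total = sum(attention.values())
    # Attention-weighted consolidation of the local learnings.
    new_global = sum(attention[e] * w for e, w in locals_.items()) / total
    return new_global, locals_

entity_data = {
    "issuer_a": [(1.0, 2.0), (2.0, 4.0)],   # y roughly 2x
    "issuer_b": [(1.0, 1.0), (2.0, 2.0)],   # y roughly x
}
attention = {"issuer_a": 0.5, "issuer_b": 0.5}
global_w, locals_ = training_round(0.0, entity_data, attention)
```

The updated global weight lands between the two entities' local results, reflecting the shared (transferred) learnings.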
In one example, the plurality of entities 106a-106n may refer to facilities for which the server system 102 is configured to perform the large-scale neural network training. In one embodiment, the model database 104 provides a storage location for neural network models associated with the plurality of entities 106a-106n.
In one example, the one or more data sources 108 may include data centers, repositories, data warehouses, and the like. In another example, the one or more data sources 108 may include a plurality of servers (e.g., application server, web server, media server etc.). In an example, the one or more data sources 108 may include entity-specific information related to each of the plurality of entities 106a-106n.
In an embodiment, information associated with the plurality of entities 106a-106n is accessed from the one or more data sources 108. In another embodiment, information associated with the plurality of entities 106a-106n is accessed from the model database 104.
FIG. 1B illustrates another exemplary representation of an environment 120 related to at least some example embodiments of the present disclosure. Although the environment 120 is presented in one arrangement, other embodiments may include the parts of the environment 120 (or other parts) arranged otherwise depending on, for example, training of large-scale neural network models, etc. The environment 120 generally includes a server system 122, a model database 124, a plurality of financial entities 126a, 126b …, 126n, one or more financial data sources 128, each coupled to, and in communication with (and/or with access to) a network 130. The network 130 may include, without limitation, a light fidelity (Li-Fi) network, a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), a satellite network, the Internet, a fiber-optic network, a coaxial cable network, an infrared (IR) network, a radio frequency (RF) network, a virtual network, and/or another suitable public and/or private network capable of supporting communication among the entities illustrated in FIG. 1B, or any combination thereof.
Various entities in the environment 120 may connect to the network 130 in accordance with various wired and wireless communication protocols, such as Transmission Control Protocol and Internet Protocol (TCP/IP), User Datagram Protocol (UDP), 2nd Generation (2G), 3rd Generation (3G), 4th Generation (4G), 5th Generation (5G) communication protocols, Long Term Evolution (LTE) communication protocols, or any combination thereof. For example, the network 130 may include multiple different networks, such as a private network made accessible by the network 130 to the server system 122, and a public network (e.g., the Internet).
Examples of the plurality of financial entities 126a-126n may include, but are not limited to, banks, non-banking financial companies, asset management companies, and other financial institutions. Each financial entity is associated with a local neural network model that is trained based on corresponding entity-specific training datasets. In an example, each of the plurality of financial entities 126a-126n is an issuer. In another example, each of the plurality of financial entities 126a-126n is an acquirer.
In one embodiment, the acquirer server is associated with a financial institution (e.g., a bank) that processes financial transactions. This can be an institution that facilitates the processing of payment transactions for physical stores or merchants, or an institution that owns platforms that make online purchases or purchases made via software applications possible (e.g., shopping cart platform providers and in-app payment processing providers). The terms “acquirer”, “acquiring bank”, or “acquirer server” will be used interchangeably herein.
In one embodiment, the issuer server is associated with a financial institution normally called an "issuer bank", "issuing bank", or simply "issuer", in which a cardholder may have a payment account. The issuer also issues a payment card, such as a credit card or a debit card, and provides banking services (e.g., processing of electronic payment transactions using credit/debit cards) to the cardholder.
In an example, the one or more financial data sources 128 may include data centers, repositories, data warehouses, and the like. In another example, the one or more financial data sources 128 may include a plurality of servers (e.g., application server, web server, media server). In an example, the one or more financial data sources 128 may include entity-specific information related to each of the plurality of financial entities 126a-126n (collectively represented as financial entities 126). In one example, the one or more financial data sources 128 may include information such as different types of payment cards issued by a financial entity, payment cards issued to a specific cardholder, payment card information such as type of payment card, card number, expiry date, name of cardholder, daily withdrawal limit associated with a particular payment card, etc.
In an embodiment, information associated with the plurality of financial entities 126a-126n is accessed from the one or more financial data sources 128. In another embodiment, information associated with the plurality of financial entities 126a-126n is accessed from the model database 124. The model database 124 provides a storage location for storing neural network models associated with the plurality of financial entities 126a-126n.
The server system 122 is configured to perform one or more of the operations described herein. In particular, the server system 122 is configured to train the global neural network model and the plurality of entity-specific NN models. In one example, the global neural network model includes instances of the plurality of entity-specific NN models. Each entity-specific NN model is trained corresponding to a financial entity of the plurality of financial entities 126a-126n. In an example, a first entity-specific NN model is trained corresponding to a first financial entity (e.g., issuer server A), a second entity-specific NN model is trained corresponding to a second financial entity (e.g., issuer server B), a third entity-specific NN model is trained corresponding to a third financial entity (e.g., acquirer server C), and the like. The server system 122 is configured to train the first entity-specific NN model while generalizing learnings of other financial entities to generate a network score as an output. In one embodiment, the network score may refer to a card lifetime value score, a decision intelligence score, and the like. In one implementation of the present disclosure, the server system 122 is configured to train the first entity-specific NN model of issuer server A in such a way that learnings (e.g., fraud patterns) of other financial entities are shared, and the first entity-specific NN model is configured to determine fraud patterns across cardholders of the issuer server A.
In one example, the decision intelligence score may be generated by the Mastercard® Decision Intelligence product (registered as a trademark) powered by Mastercard®, which provides decision and fraud detection services. The product uses artificial intelligence technology to help financial institutions increase the accuracy of real-time approvals of genuine transactions and reduce false declines.
In one embodiment, the payment network 134 may be used by the payment card issuing authorities as a payment interchange network. The payment network 134 may include a plurality of payment servers such as the payment server 132. Examples of payment interchange networks include, but are not limited to, the Mastercard® payment system interchange network. The Mastercard® payment system interchange network is a proprietary communications standard promulgated by Mastercard International Incorporated® for the exchange of financial transactions among a plurality of financial entities that are members of Mastercard International Incorporated®. (Mastercard is a registered trademark of Mastercard International Incorporated located in Purchase, N.Y.).
The number and arrangement of systems, devices, and/or networks shown in FIG. 1B is provided as an example. There may be additional systems, devices, and/or networks; fewer systems, devices, and/or networks; different systems, devices, and/or networks; and/or differently arranged systems, devices, and/or networks than those shown in FIG. 1B. Furthermore, two or more systems or devices shown in FIG. 1B may be implemented within a single system or device, or a single system or device shown in FIG. 1B may be implemented as multiple, distributed systems or devices. Additionally, or alternatively, a set of systems (e.g., one or more systems) or a set of devices (e.g., one or more devices) of the environment 120 may perform one or more functions described as being performed by another set of systems or another set of devices of the environment 120.
Referring now to FIG. 2, a simplified block diagram of a server system 200 is shown, in accordance with an embodiment of the present disclosure. The server system 200 is similar to the server system 102. In some embodiments, the server system 200 is embodied as a cloud-based and/or SaaS-based (software as a service) architecture.
The server system 200 includes a computer system 202 and a database 204. The computer system 202 includes at least one processor 206 for executing instructions, a memory 208, a communication interface 210, and a storage interface 214 that communicate with each other via a bus 212.
In some embodiments, the database 204 is integrated within the computer system 202. For example, the computer system 202 may include one or more hard disk drives as the database 204. The storage interface 214 is any component capable of providing the processor 206 with access to the database 204. The storage interface 214 may include, for example, an Advanced Technology Attachment (ATA) adapter, a Serial ATA (SATA) adapter, a Small Computer System Interface (SCSI) adapter, a RAID controller, a SAN adapter, a network adapter, and/or any component providing the processor 206 with access to the database 204. In one embodiment, the database 204 is configured to store an autoencoder 226, a global neural network (NN) model 228, and a plurality of entity-specific neural network (NN) models 230 (shown as entity-specific NN models 230 for simplicity).
Examples of the processor 206 include, but are not limited to, an application-specific integrated circuit (ASIC) processor, a reduced instruction set computing (RISC) processor, a complex instruction set computing (CISC) processor, a field-programmable gate array (FPGA), a graphics processing unit (GPU), and the like. The memory 208 includes suitable logic, circuitry, and/or interfaces to store a set of computer-readable instructions for performing operations. Examples of the memory 208 include a random-access memory (RAM), a read-only memory (ROM), a removable storage drive, a hard disk drive (HDD), and the like. It will be apparent to a person skilled in the art that the scope of the disclosure is not limited to realizing the memory 208 in the server system 200, as described herein. In another embodiment, the memory 208 may be realized in the form of a database server or cloud storage working in conjunction with the server system 200, without departing from the scope of the present disclosure.
The processor 206 is operatively coupled to the communication interface 210 such that the processor 206 is capable of communicating with a remote device 216 such as, the payment server 132, or communicating with any entity connected to the network 110 (as shown in FIG. 1). In one embodiment, the processor 206 is configured to access a set of data samples associated with the plurality of entities 106 from the one or more data sources 108.
It is noted that the server system 200 as illustrated and hereinafter described is merely illustrative of an apparatus that could benefit from embodiments of the present disclosure and, therefore, should not be taken to limit the scope of the present disclosure. It is noted that the server system 200 may include fewer or more components than those depicted in FIG. 2.
In one embodiment, the processor 206 includes a data pre-processing engine 218, a scaling engine 220, a model training engine 222, and a scoring engine 224. It should be noted that components, described herein, such as the data pre-processing engine 218, the scaling engine 220, the model training engine 222, and the scoring engine 224 can be configured in a variety of ways, including electronic circuitries, digital arithmetic, and logic blocks, and memory systems in combination with software, firmware, and embedded technologies.
The data pre-processing engine 218 includes suitable logic and/or interfaces for accessing a set of data samples associated with each entity (e.g., entity 106a) from the one or more data sources 108. The set of data samples may depict particular characteristics of the entity. Examples of pre-processing operations performed by the data pre-processing engine 218 include normalization operations, splitting of datasets, merging of datasets, and other suitable preprocessing operations. In one embodiment, the data pre-processing engine 218 may split the labeled subset of the set of data samples into the training data set and validation data set. The data pre-processing engine 218 may randomly partition the set of data samples into k equal-sized subsets, one of which is then utilized as the validation data set, and the remaining k−1 compose the training data set. Further, the data pre-processing techniques may include feature aggregation, feature sampling, dimensionality reduction, feature encoding, data splitting, and the like. In one example, the data pre-processing engine 218 is configured to remove all the special characters and numbers from the data samples and convert the data into lowercase. The data is further clustered into a plurality of clusters by running a 2-step word2vec followed by K-Nearest neighbors clustering. This process of data pre-processing results in the quantification of the data points in the dataset along with cluster numbers.
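The k-way partitioning performed by the data pre-processing engine 218 can be sketched as follows (the function name, seed handling, and fold selection are illustrative assumptions, not taken from the disclosure):

```python
import random

def k_fold_split(samples, k, fold_index=0, seed=0):
    """Randomly partition samples into k equal-sized subsets; one subset
    becomes the validation data set and the remaining k-1 compose the
    training data set."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    validation = folds[fold_index]
    training = [s for i, fold in enumerate(folds) if i != fold_index for s in fold]
    return training, validation

# 100 labeled samples split 80/20 with k = 5.
train, val = k_fold_split(list(range(100)), k=5)
```

In practice, the fold used for validation may be rotated across training runs so every subset serves as the validation data set once.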
Similarly, the data pre-processing engine 218 may adopt suitable data pre-processing techniques based on the data samples received from the one or more data sources 108. The data samples may include categorical data, numerical data, image data, and the like. The data pre-processing engine 218 is configured to quantify the data samples by utilizing suitable techniques based on the type of data present in the dataset. The data pre-processing engine 218 is further configured to generate a plurality of data features based at least in part on the set of data samples associated with each entity.
With reference to the FIG. 1B, the set of data samples may include, but are not limited to, financial data associated with each financial entity. In one example, the set of data samples may represent card types of cardholders associated with an issuer. In one embodiment, the plurality of data features may include card-level features, merchant-level features, and the like. In an embodiment, the merchant-level features may include features based on payment transactions of at least one merchant. In one example, the merchant-level features may include, but are not limited to, total purchase amount for a pre-determined duration spent at each merchant, total purchase amount spent by various cardholders possessing various card types for the pre-determined duration, total number of transactions by the various cardholders having different card types within the pre-determined duration, total number of online transactions performed at each merchant within the pre-determined duration, and total numbers of transactions involving a payment card at each merchant within the pre-determined time duration.
The scaling engine 220 includes suitable logic and/or interfaces for generating an input scalar value for the plurality of data features based, at least in part, on the set of data samples corresponding to each entity. In one example, a first input scalar value is generated for the first entity based on the set of data samples corresponding to the first entity, a second input scalar value is generated for the second entity based on the set of data samples corresponding to the second entity, and so on.
The input scalar value corresponding to each entity is provided as an input to the autoencoder 226. In an embodiment, an identifier associated with the corresponding entity of the plurality of entities 106 may also be fed as an input to the autoencoder 226. In another embodiment, an identifier associated with the corresponding financial entity of the plurality of financial entities 106 may also be fed as an input to the autoencoder 226. In one embodiment, the autoencoder 226 corresponds to an autoencoder architecture configured for scalability. In general, an autoencoder is a specialized type of artificial neural network that is configured to learn efficient coding of unlabeled data. The encoding is further validated and refined by attempting to regenerate the input from the encoding.
In one embodiment, the scaling engine 220 is fed with an identifier of the entity and the plurality of data features associated with the entity as an input during training. In addition, the scaling engine 220 is fed with desired output (e.g., scaled data features). In one example, the identifier is used to identify the entity of the plurality of entities 106 or the financial entity of the plurality of financial entities 126a-126n. The scaling engine 220 utilizes the autoencoder 226 to learn underlying distribution during the training phase.
In one embodiment, the set of data samples (unscaled) is fed as an input to the autoencoder 226 and further, output of the autoencoder 226 is evaluated against scaled features for the given set of data samples. This is how the scaling engine 220 enables the autoencoder 226 to generate scaled features from raw input. In one example, the autoencoder 226 is fed with input as data feature 1, data feature 2, data feature 3…data feature n. After processing the input, the autoencoder 226 provides output as scaled data feature 1, scaled data feature 2, scaled data feature 3,…scaled data feature n. The autoencoder 226 is configured to provide a scaled data feature corresponding to each data feature of the plurality of data features.
During the implementation phase, the autoencoder 226 is fed with the input scalar value (only one scalar) consisting of the identifier of the entity and the plurality of data features associated with the entity. In one embodiment, during the implementation phase, the autoencoder 226 is fed with the input scalar value (only one scalar) consisting of the identifier of the financial entity and the plurality of data features associated with the financial entity. The scaling engine 220 is configured to process the input scalar value with the facilitation of the autoencoder 226 for scaling the plurality of data features for each entity of the plurality of entities 106. In other words, the autoencoder 226 reconstructs the plurality of data features for generating standardized and scaled data instances for each entity of the plurality of entities 106.
In one embodiment, the scaling engine 220 provides a scaled output (i.e., scaled data instances) for the plurality of entities 106 based on the input scalar value. The scaling engine 220 utilizes the autoencoder architecture to provide scaled output (i.e., scaled data instances) for each of the plurality of entities 106. The scaled data features are further provided as an input to the model training engine 222.
In general, autoencoders are a type of deep neural network model that can be used to reduce data dimensionality. Deep neural network models are composed of many layers of neural units, and in autoencoders, every pair of adjacent layers forms a full bipartite graph of connectivity. The layers of an autoencoder collectively form an hourglass shape, where the input layer is large and subsequent layers reduce in size until the center-most layer is reached. From there until the output layer, layer sizes expand back to the original input size.
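The hourglass structure described above can be illustrated with a minimal forward pass (layer sizes, the tanh activation, and the random weights are illustrative assumptions; a real autoencoder such as the autoencoder 226 would be trained against scaled targets):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hourglass layer sizes: large input, shrinking to a bottleneck, expanding back.
layer_sizes = [64, 32, 8, 32, 64]

# Every pair of adjacent layers is fully connected (a full bipartite graph).
weights = [rng.normal(0, 0.1, (m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x):
    """Encode down to the bottleneck, then decode back to the input size."""
    for w in weights:
        x = np.tanh(x @ w)
    return x

x = rng.normal(size=64)
reconstruction = forward(x)   # same shape as the input
```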
The model training engine 222 includes suitable logic and/or interfaces for training an entity-specific NN model for each entity based at least in part, on the corresponding training data set. At first, the model training engine 222 is configured to initialize global neural network parameters of the global NN model 228. In one embodiment, the model training engine 222 is configured to initialize the global neural network parameters (e.g., neural network weights or biases) of the global NN model 228 based on one or more initialization methods (e.g., Xavier initialization). In general, Xavier initialization is an attempt to initialize the weights of a neural network, such that the variance of activations is the same across every layer.
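The Xavier (Glorot) initialization mentioned above can be sketched as follows; the uniform variant is shown here as one common formulation (the function name and generator handling are illustrative):

```python
import numpy as np

def xavier_init(fan_in, fan_out, rng=None):
    """Xavier/Glorot uniform initialization: the weight range is scaled by
    the layer's fan-in and fan-out so that the variance of activations
    stays roughly the same across every layer."""
    rng = rng or np.random.default_rng(0)
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Initialize one layer of global neural network weights.
w = xavier_init(256, 128)
```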
Thereafter, the model training engine 222 is configured to create instances of the global neural network parameters for the plurality of entity-specific NN models. In an example, there are 10 different entities. To train 10 different entity-specific NN models, the model training engine 222 creates 10 instances of the global neural network parameters of the global NN model 228.
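The instancing step above amounts to giving each entity-specific model an independent copy of the global parameters, so local training of one model does not mutate the others (the parameter layout below is an illustrative assumption):

```python
import copy

# Illustrative global neural network parameters (weights and biases).
global_params = {"w1": [[0.1, -0.2], [0.3, 0.0]], "b1": [0.0, 0.0]}

# One independent instance of the global parameters per entity.
num_entities = 10
entity_params = [copy.deepcopy(global_params) for _ in range(num_entities)]

# A local update to entity 0 leaves the global parameters unchanged.
entity_params[0]["b1"][0] = 0.5
```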
During entity-specific training, the entity-specific NN model of an entity is fed with a scaled dataset corresponding to the entity. Each of the plurality of entity-specific NN models may include an input layer, multiple hidden layers, and an output layer. In one example, each of the plurality of entity-specific NN models may include ‘n’ number of model weights. The model training engine 222 is configured to calculate validation loss values for each entity-specific NN model associated with the corresponding entity of the plurality of entities 106 based, at least in part, on the local neural network parameters (e.g., weights and biases) of each entity-specific NN model.
Once the plurality of entity-specific NN models gets trained, the model training engine 222 is configured to provide the validation dataset of the entity to the respective entity-specific NN model. The model training engine 222 is configured to calculate a validation loss value corresponding to the validation data set for the entity. In a similar fashion, the model training engine 222 is configured to calculate the validation loss values for other trained entity-specific NN models.
In one embodiment, the model training engine 222 is configured to calculate an attention weight (i.e., a learnable parameter) corresponding to each entity-specific NN model. In one embodiment, the attention weights enable the global neural network model 228 to decide which entity-specific NN models should be given more weightage during training and which should be given less weightage. In other words, the attention weight for each entity-specific NN model is configured according to the neural network parameters of that entity-specific NN model.
Thereafter, the model training engine 222 is configured to calculate a consolidated loss value based on a combination of the validation loss values of the entity-specific NN models and the attention weights. In one embodiment, the validation loss values may be combined in a linear manner, such as averaging or weighted averaging, or in a non-linear manner. It should be noted that combining the plurality of losses is an optional technical feature and is not strictly necessary in every embodiment. In one embodiment, the validation loss values are consolidated using weighted averaging; however, the weight of each loss term is not pre-determined but is instead learned using an attention-based mechanism.
The model training engine 222 is further configured to update the global neural network parameters (e.g., the consolidated weight value) of the global neural network model 228 based, at least in part, on the consolidated loss value. In one embodiment, the global neural network parameters (e.g., weights and biases) of the global neural network model 228 are updated based, at least in part, on the consolidated weight value. The model training engine 222 is configured to repeat the updating of the global neural network parameters while the calculated consolidated loss value remains above a threshold value.
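The attention-weighted consolidation described above can be sketched as follows. A softmax over learnable attention logits is one common way to obtain normalized weights; the disclosure does not specify the exact mechanism, so this is an illustrative assumption:

```python
import numpy as np

def consolidated_loss(validation_losses, attention_logits):
    """Weighted average of per-entity validation losses, where the weights
    are learnable attention weights normalized with a softmax rather than
    being pre-determined."""
    logits = np.asarray(attention_logits, dtype=float)
    weights = np.exp(logits - logits.max())   # numerically stable softmax
    weights /= weights.sum()
    return float(np.dot(weights, validation_losses)), weights

# Three entity-specific validation losses with equal attention logits.
losses = [0.8, 0.2, 0.5]
loss, w = consolidated_loss(losses, attention_logits=[0.0, 0.0, 0.0])
# Equal logits reduce the weighted average to the plain average: 0.5
```

During training, the attention logits would be updated by back-propagating the consolidated loss, so that more informative entity-specific models receive larger weights.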
The updated global neural network parameters further facilitate the training of each of the plurality of entity-specific NN models with transfer learning methods. In general, transfer learning focuses on gaining knowledge by solving one task and using the same knowledge to solve another task. In other words, transfer learning is used in machine learning to re-use a pre-trained model on a new problem.
In one embodiment, the model training engine 222 is also configured to aggregate similar entity-specific NN models together based, at least in part, on the calculated validation loss values for reducing computational complexity. In one embodiment, the similar entity-specific NN models are aggregated together for identifying similar entities of the plurality of entities 106 showing similar data patterns.
In one example, when calculated validation loss value for an entity-specific NN model corresponding to an entity does not improve over a period of time (e.g., 1 month, 2 months, etc.), the model training engine 222 is configured to adjust or update the global neural network parameters of the global neural network model 228 accordingly.
During the final epoch, the model training engine 222 is configured to store the local neural network parameters (i.e., weights and biases) of the entity-specific NN models, the global neural network parameters (i.e., consolidated weight value), and validation loss values and attention weights of the entity-specific NN models. In one embodiment, the attention weights of the entity-specific NN models 230 are adapted based, at least in part, on output values of the entity-specific NN models 230 during the validation phase.
In one embodiment, during the training phase, the model training engine 222 is configured to train the plurality of entity-specific NN models 230. In addition, attention weights are used to calculate the consolidated loss value (on validation data) for updating the parameters of the global NN model 228 during the training phase.
The scoring engine 224 includes suitable logic and/or interfaces for generating/calculating a prediction score based at least on a trained entity-specific NN model.
FIG. 3A represents an architecture 300 of a global neural network model 302 instanced based on a plurality of entity-specific NN models 304, in accordance with an embodiment of the present disclosure. As shown in FIG. 3A, the architecture 300 includes the global neural network (NN) model 302 instanced based on the plurality of entity-specific NN models 304. The global neural network model 302 is identical to the global neural network model 228. The plurality of entity-specific NN models 304 is identical to the plurality of entity-specific NN models 230. In addition, the plurality of entity-specific NN models 304 includes a first entity-specific NN model 304a, a second entity-specific NN model 304b, a third entity-specific NN model 304c, …., and a last entity-specific NN model 304n.
In one embodiment, the training engine 222 is configured to generate the plurality of entity-specific NN models 304 based, at least in part, on the plurality of entities 106. In addition, the training engine 222 is configured to instance the global neural network model 302 based, at least in part, on the plurality of entity-specific NN models 304. Each of the entity-specific NN models is trained based on the corresponding entity of the plurality of entities 106. In one example, each of the entity-specific NN models is an entity-specific NN model being trained based on the corresponding financial entity of the plurality of financial entities 126a-126n.
In one embodiment, a number of the plurality of entity-specific NN models 304 is equal to the number of the plurality of entities 106 for which the global neural network model 302 is to be trained. In addition, each of the plurality of entity-specific NN models 304 corresponds to a dynamic recurrent neural network (DRNN).
In general, the structure of a DRNN is more complex than that of a static recurrent neural network (RNN). Because of special interconnections in the neural network, DRNNs enable the analysis of time-dependent data. A DRNN may include an input layer, hidden layers, and an output layer. In general, once input data is transferred to a specific network element (e.g., any of the hidden layers of neurons), the input data is stored in a memory and integrated with subsequent inputs. Therefore, a DRNN provides the possibility of using past inputs (past information) to predict current and future states.
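The integration of stored past inputs with the current input can be illustrated with a minimal recurrent step (layer sizes, the tanh activation, and random weights are illustrative assumptions, not the DRNN architecture of the disclosure):

```python
import numpy as np

rng = np.random.default_rng(0)
W_in = rng.normal(0, 0.1, (4, 8))    # input -> hidden weights
W_rec = rng.normal(0, 0.1, (8, 8))   # hidden -> hidden (recurrent) weights

def rnn_step(x_t, h_prev):
    """The current input x_t is integrated with the stored past state
    h_prev, so past information influences the current state."""
    return np.tanh(x_t @ W_in + h_prev @ W_rec)

h = np.zeros(8)                       # memory of past inputs
for x_t in rng.normal(size=(5, 4)):   # a time-dependent input sequence
    h = rnn_step(x_t, h)              # state carries forward in time
```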
Each of the plurality of entity-specific NN models 304 includes an input layer, hidden layers, and an output layer (not shown in the figures). Each of the plurality of entity-specific NN models 304 is associated with a local neural network parameter θ. In one example, the first entity-specific NN model 304a is associated with a local neural network parameter θ1, the second entity-specific NN model 304b is associated with a local neural network parameter θ2, the third entity-specific NN model 304c is associated with a local neural network parameter θ3, …. the last entity-specific NN model 304n is associated with a local neural network parameter θn, and so on. (as shown in FIG. 3A)
In one example, the local neural network parameter represents weights and biases. In addition, the server system 200 is configured to calculate a validation loss value corresponding to each of the plurality of entity-specific NN models 304. In one example, the server system 200 is configured to calculate a first validation loss value ΔL1 corresponding to the first entity-specific NN model 304a, a second validation loss value ΔL2 corresponding to the second entity-specific NN model 304b, a third validation loss value ΔL3 corresponding to the third entity-specific NN model 304c, …, and a last validation loss value ΔLn corresponding to the last entity-specific NN model 304n. (as shown in FIG. 3A)
The server system 200 is further configured to calculate a consolidated loss value ΔLval based on a combination of the validation loss values ΔL1 to ΔLn. In one embodiment, the consolidated loss value ΔLval is equal to a weighted average of the validation loss values ΔL1 to ΔLn. The consolidated loss value ΔLval is further back-propagated to the global neural network model 302 to calculate the consolidated weight value of the global neural network model 302.
In one example, the server system 200 is configured to train the global neural network model 302 to calculate the card lifetime value network score. In one embodiment, the card lifetime value network score may be calculated for a financial entity (e.g., issuer server or acquirer server). In general, the card lifetime value is a network score that is calculated by following a hierarchical structure (e.g., country (e.g., Singapore, India, Australia, etc.) includes various card product types, and further various card product types include a plurality of issuers or entities). In the above example, there may be n number of card product types in a specific country. In addition, there may be n number of issuers that issue the various card product types. Here, n is a natural number.
For training the global neural network model 302 for calculating the card lifetime value network score, training data may include card-level features (e.g., number sequences, chip or QR-code embedded in the payment card, textual data, logo image, security feature, color, component placement in the payment card, background, card name, card issuer name, card type, contactless payment feature, card network, etc.) to classify each card as either one of Premium, High, Enhanced, Medium, and Low. As stated above, the calculation of the card lifetime value network score follows a hierarchical structure, and thus, the server system 200 is configured to train the plurality of entity-specific NN models 230 at different levels of hierarchy to obtain performance gain.
In one example, for training the global neural network model 302, the training data may include the card-level feature of the weekly aggregated sum of interchange fees for a given financial entity. Initially, for scaling purposes, raw and corresponding scaled data is used for the training of the autoencoder 226. Once training of the autoencoder 226 is complete, the plurality of data features (i.e., raw features) and an identifier associated with the entity is provided as an input to the autoencoder 226. As a result, the autoencoder 226 provides scaled data instances for the entity.
The server system 200 is configured to train the global neural network model 302 and calculate the card lifetime value network score even in a scenario where there is high variance across issuers in terms of the number of cards issued by each of the plurality of entities (i.e., issuers) and the behavior of cardholders. The aforementioned steps for training the global neural network model 302 are herein explained in detail with reference to FIG. 2, and therefore, they are not reiterated for the sake of brevity.
FIG. 3B represents an architecture 320 of a global neural network model 322 instanced based on a plurality of entity-specific NN models 324 after the addition of a new entity-specific NN model, in accordance with an embodiment of the present disclosure. As shown in FIG. 3B, the architecture 320 includes the global neural network (NN) model 322 instanced based on the plurality of entity-specific NN models 324. The global neural network model 322 is identical to the global neural network model 228. The plurality of entity-specific NN models 324 is identical to the plurality of entity-specific NN models 230. In addition, the plurality of entity-specific NN models 324 includes a first entity-specific NN model 324a, a second entity-specific NN model 324b, a third entity-specific NN model 324c, …, a last entity-specific NN model 324n, and a new entity-specific NN model 324n+1.
In one embodiment, the training engine 222 is configured to generate the plurality of entity-specific NN models 324 based, at least in part, on the plurality of entities 106. In addition, the training engine 222 is configured to instance the global neural network model 322 based, at least in part, on the plurality of entity-specific NN models 324. Each of the entity-specific NN models is trained based on the corresponding entity of the plurality of entities 106. In one example, each of the entity-specific NN models is an entity-specific NN model being trained based on the corresponding financial entity of the plurality of financial entities 126a-126n.
In one embodiment, a number of the plurality of entity-specific NN models 324 is equal to the number of the plurality of entities 106 for which the global neural network model 322 is to be trained. In addition, each of the plurality of entity-specific NN models 324 corresponds to a dynamic recurrent neural network (DRNN).
In one example, the local neural network parameters represent weights and biases. In addition, the server system 200 is configured to calculate a validation loss value corresponding to each of the plurality of entity-specific NN models 324 (as explained above). In one example, the server system 200 is configured to calculate a first validation loss value ΔL1 corresponding to the first entity-specific NN model 324a, a second validation loss value ΔL2 corresponding to the second entity-specific NN model 324b, a third validation loss value ΔL3 corresponding to the third entity-specific NN model 324c, …, and a last validation loss value ΔLn corresponding to the last entity-specific NN model 324n (as shown in FIG. 3B).
The server system 200 is further configured to calculate a consolidated loss value ΔLval based on a combination of the validation loss values ΔL1 to ΔLn. In one embodiment, the consolidated loss value ΔLval is equal to a weighted average of the validation loss values ΔL1 to ΔLn. The consolidated loss value ΔLval is further backpropagated to the global neural network model 322 to update the consolidated weight value (i.e., the global neural network parameters) of the global neural network model 322.
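The consolidation step above may be sketched as a weighted average; the validation loss values and attention weights below are illustrative stand-ins:

```python
# Illustrative sketch: the consolidated loss ΔLval as a weighted average of
# per-entity validation losses. All values below are made up for illustration.
val_losses = [0.42, 0.31, 0.57, 0.26]        # ΔL1 .. ΔL4 from the entity models
attn_weights = [0.4, 0.3, 0.2, 0.1]          # attention weights, summing to 1
consolidated_loss = sum(w * l for w, l in zip(attn_weights, val_losses))
```

With weights that sum to one, the consolidated loss reduces to a convex combination of the per-entity validation losses.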
In one embodiment, after training of the global neural network model 322 is complete, the server system 200 is configured to save the local neural network parameters (i.e., weights and biases) and the global neural network parameters (i.e., consolidated weight value) during the final epoch. The server system 200 is configured to train the global neural network model 322 for a new entity. In one example, the new entity is a financial entity (e.g., issuer, acquirer, etc.). For training the global neural network model 322 for the new entity, a new entity-specific NN model is added to the global neural network model 322. The global neural network model 322 is instanced again based on the addition of the new entity-specific NN model.
Instead of training the global neural network model 322 from scratch, the server system 200 is configured to fine-tune the local neural network parameters of the new entity-specific NN model based, at least in part, on the global neural network parameters (i.e., the consolidated weight value) of the global neural network model 322. In one embodiment, the new entity-specific NN model is created as an instance of the global neural network model 322 for the new entity (e.g., issuer, acquirer, etc.) (as shown in FIG. 3B).
In one embodiment, fine-tuning is a separate process that is not performed during the training phase. The fine-tuning process is only required after the addition of the new entity-specific NN model if the existing model parameters (e.g., weights and biases) are not good enough to produce optimal results for the new entity. In one example, during fine-tuning, a copy of the global weight value is created, and the new entity-specific NN model is trained based on the set of data samples for the new entity only. The new entity-specific NN model is trained as an independent model, and the global neural network parameters are not updated based on the new entity-specific NN model. This provides the advantage of not training the new entity-specific NN model (i.e., for the new entity) from scratch, and the training of the new entity-specific NN model may be performed with a smaller set of data samples.
FIG. 4 is a schematic representation 400 of the global NN model and entity-specific NN models of the plurality of entities, in accordance with an embodiment of the present disclosure. As mentioned previously, the processor 206 is configured to perform robust large-scale neural network training for the plurality of entities. In particular, a first neural network model for the first entity is fine-tuned based on simultaneous learnings of the other neural network models. The processor 206 is configured to generalize learnings of each entity-specific NN model among all the entity-specific NN models. The processor 206 is configured to train the global neural network model 402 based on combined learnings of the plurality of entity-specific NN models 404a-404n.
Each of the plurality of entity-specific NN models 404 includes an input layer, hidden layers, and an output layer (not shown in figures). Each of the plurality of entity-specific NN models 404a-404n is associated with local neural network parameters θ. In one example, the first entity-specific NN model 404a is associated with local neural network parameters θ1, the second entity-specific NN model 404b is associated with local neural network parameters θ2, …, and the n-th entity-specific NN model 404n is associated with local neural network parameters θn (as shown in FIG. 4).
As explained above, the processor 206 is configured to access the set of data samples associated with the plurality of entities 106 from the one or more data sources 108. The set of data samples may be divided into training data set and validation data set. In one embodiment, the set of data samples may be represented in a dataset D as:
D={x,y}_m … Eqn. (1)
Where x represents data features, y represents data labels, and m represents a total number of data samples. The dataset D is further divided into a disjoint training subset (i.e., training data set) and validation subset (i.e., validation data set) for each of the plurality of entities 106 (i.e., 'n' entities) (where the entities may also represent the financial entities 126a-126n (e.g., issuer, acquirer, etc.)) as:
D = {(D_1^train, D_1^val), (D_2^train, D_2^val), …, (D_n^train, D_n^val)} … Eqn. (2)
The main objective of the training is to minimize the loss function or loss value (i.e., min_θ Σ_i L(θ, D_i)) for each entity-specific NN model. For each of the plurality of entities 106, the processor 206 is configured to sample the dataset as:
(D_i^train, D_i^val) … Eqn. (3)
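The per-entity split of Eqn. (2) and the sampling of Eqn. (3) may be sketched as follows; the sample data and the 80/20 split ratio are illustrative assumptions:

```python
# Illustrative sketch of Eqns. (2)-(3): each entity's samples are divided
# into disjoint training and validation subsets.
def split_entity(samples, val_frac=0.2):
    """Return (D_i^train, D_i^val) as a disjoint split of one entity's samples."""
    cut = int(len(samples) * (1 - val_frac))
    return samples[:cut], samples[cut:]

dataset = {"entity_1": list(range(10)), "entity_2": list(range(20))}  # made-up D
D = {name: split_entity(samples) for name, samples in dataset.items()}
train_1, val_1 = D["entity_1"]   # sampled pair (D_1^train, D_1^val)
```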
During the training phase, global neural network parameters 'θ' of the global neural network (NN) model 402 are initialized based on Xavier initialization, or any other similar initialization. Thereafter, local neural network parameters 'θi' of each of the plurality of entity-specific NN models 404a-404n are instanced based on the global neural network parameters. In one example, the processor 206 is configured to create the local neural network parameters θ1 for the first entity-specific NN model 404a, and the local neural network parameters θ2 for the second entity-specific NN model 404b.
More specifically, the processor 206 is configured to generate various copies of the global neural network model 402. Each copy of the global neural network model 402 is a separate entity-specific NN model of the plurality of entity-specific NN models 404a-404n.
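The copying step may be sketched as follows, with a plain dictionary standing in for the global neural network model 402; the parameter values are illustrative:

```python
import copy

# Illustrative sketch: each entity-specific model is instanced as an
# independent copy of the global model's parameters.
global_model = {"weights": [0.1, 0.2], "biases": [0.0]}
entity_models = {f"entity_{i}": copy.deepcopy(global_model) for i in range(1, 4)}

# Local training changes one copy without touching the global model or
# the other copies.
entity_models["entity_1"]["weights"][0] = 0.9
```

A deep copy is important here: a shallow copy would share the nested parameter lists, so a local update would silently mutate the global parameters.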
At first, the processor 206 is configured to train the plurality of entity-specific NN models 404a-404n based, at least in part, on the corresponding training data set associated with each entity of the plurality of entities 106a-106n. With reference to FIG. 1B, the processor 206 is configured to train the plurality of entity-specific NN models 404a-404n based, at least in part, on the corresponding financial training data set associated with each financial entity of the plurality of financial entities 126a-126n.
For each entity-specific NN model, the processor 206 is configured to update the value of θi based on the training loss value calculated over the training data set D_i^train as:
θ_i = θ_i − α∇_(θ_i) L_i … Eqn. (4)
θ_i = θ_i − α∇_(θ_i) L(θ_i, D_i^train) … Eqn. (5)
Where α is the step size. In general, the step size denotes the learning rate of the underlying algorithm. The processor 206 is further configured to update the value of θi for n number of iterations, where n is a natural number.
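The inner-loop update of Eqns. (4)-(5) may be sketched as follows; a least-squares loss and synthetic data stand in for an entity model's actual loss, and the step size and iteration count are illustrative:

```python
import numpy as np

# Illustrative sketch of Eqns. (4)-(5): gradient-descent updates of an
# entity's local parameters θ_i on its training data.
rng = np.random.default_rng(1)
X = rng.normal(size=(32, 3))                      # D_i^train features (made up)
y = X @ np.array([1.0, -2.0, 0.5])                # D_i^train labels (made up)

theta_i = np.zeros(3)                             # instanced from the global θ
alpha = 0.1                                       # step size (learning rate)

for _ in range(200):                              # n training iterations
    # ∇_(θ_i) L(θ_i, D_i^train) for a mean-squared-error loss
    grad = 2 * X.T @ (X @ theta_i - y) / len(X)
    theta_i = theta_i - alpha * grad              # Eqn. (5)
```

After the loop, θ_i has moved from the global initialization toward the parameters that fit this entity's own training data.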
Once the plurality of entity-specific NN models 404a-404n are trained, the processor 206 is configured to access the validation data set associated with each entity from the one or more data sources 108. For training the global neural network model 402, the processor 206 is configured to pass the validation data set Dval through the updated local neural network parameters θi of each of the plurality of entity-specific NN models 404a-404n. After passing the validation data set Dval, the processor 206 is configured to calculate the validation loss value as:
L_v = L(θ_i, D_i^val) … Eqn. (6)
The processor 206 is configured to input the validation data set of each entity into the corresponding entity-specific NN model and is configured to calculate a validation loss value corresponding to each entity-specific NN model. In one example, the processor 206 is configured to calculate a first validation loss value L1 corresponding to the first entity-specific NN model 404a, a second validation loss value L2 corresponding to the second entity-specific NN model 404b, …, and n-th validation loss value Ln corresponding to the n-th entity-specific NN model 404n (as shown in FIG. 4).
The processor 206 is further configured to calculate a consolidated loss value Lval based on the combination of the validation loss values L1 to Ln. In one embodiment, the consolidated loss value Lval is equal to a weighted average of the validation loss values L1 to Ln. The consolidated loss value Lval is further back-propagated to the global neural network model 402 to calculate the consolidated weight value or to update the global neural network parameters of the global neural network model 402.
While calculating the consolidated loss value, the processor 206 is further configured to calculate the value of the attention weights w_i. To calculate the value of an attention weight w_i, the output Z of an entity-specific NN model is multiplied with a learnable parameter ϕ_i as:
f^(ϕ_i)(Z) = ϕ_i Z^T … Eqn. (7)
Further, the Softmax function is used to calculate attention weights as:
w_i = e^(f^(ϕ_i)(Z)) / (Σ_i e^(f^(ϕ_i)(Z))) … Eqn. (8)
Where w_i ∈ [0, 1] such that Σ_i w_i = 1.
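The attention-weight calculation of Eqns. (7)-(8) may be sketched as follows, treating each model's output as a scalar for simplicity; all values are illustrative:

```python
import numpy as np

# Illustrative sketch of Eqns. (7)-(8): score each entity model's output Z
# with its learnable parameter ϕ_i, then normalize the scores with softmax.
Z = np.array([0.2, 0.5, 0.3])                    # outputs of three entity models
phi = np.array([1.0, -0.5, 2.0])                 # learnable parameters ϕ_i

scores = phi * Z                                 # f^(ϕ_i)(Z) = ϕ_i Z^T, Eqn. (7)
w = np.exp(scores) / np.exp(scores).sum()        # softmax, Eqn. (8)
```

By construction each w_i lies in [0, 1] and the weights sum to one, as required by the constraint below Eqn. (8).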
In one embodiment, the processor 206 is configured to update the parameters used to learn attention weights as:
ϕ = ϕ − γ Σ_i w_i L(θ_i, D_i^val) … Eqn. (9)
Where {ϕ_i}_(i=1)^n = ϕ and n is the number of the plurality of entities 106a-106n (e.g., acquirer, issuer, etc.). In addition, L(θ_i, D_i^val) acts as a constant as θ_i ∩ ϕ_i = ∅. The processor 206 is further configured to update the consolidated weight value θ of the global neural network model 402 based on the consolidated loss value as:
θ = θ − β Σ_i w_i L(θ_i, D_i^val) … Eqn. (10)
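The updates of Eqns. (9)-(10) may be sketched as follows, taking the formulas above literally; the loss values, attention weights, and learning rates are illustrative:

```python
import numpy as np

# Illustrative sketch of Eqns. (9)-(10): the attention-weighted sum of
# per-entity validation losses drives the updates of ϕ and of the global θ.
val_losses = np.array([0.40, 0.25, 0.35])        # L(θ_i, D_i^val), made up
w = np.array([0.5, 0.2, 0.3])                    # attention weights w_i
gamma, beta = 0.01, 0.01                         # learning rates γ and β

weighted_loss = float((w * val_losses).sum())    # Σ_i w_i L(θ_i, D_i^val)
phi = np.array([1.0, -0.5, 2.0]) - gamma * weighted_loss   # Eqn. (9)
theta = np.array([0.30, 0.70]) - beta * weighted_loss      # Eqn. (10)
```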
In one embodiment, the processor 206 is configured to fine-tune the local neural network parameters of each entity-specific NN model based on the consolidated weights of the global neural network model and an attention weight corresponding to each entity-specific NN model, with transfer learning methods.
FIG. 5 represents a flow chart 500 of a training phase for a large-scale neural network for multiple entities, in accordance with an embodiment of the present disclosure. The sequence of operations of the flow chart 500 may not be necessarily executed in the same order as they are presented. Further, one or more operations may be grouped and performed in the form of a single step, or one operation may have several sub-steps that may be performed in parallel or in a sequential manner. It is to be noted that to explain the flow chart 500, references may be made to elements described in FIG. 1A and FIG. 2.
At 502, the server system 200 accesses a training data set associated with each entity of the plurality of entities 106 from the one or more data sources 108. In one example, the one or more data sources 108 may include data centers, repositories, data warehouses, a plurality of servers (e.g., application server, web server, media server, etc.), and the like.
In one embodiment, the plurality of entities 106 may include medical facilities (e.g., hospitals, laboratories, etc.), educational institutions, government agencies, telecom industries, and the like. In one example, the plurality of entities 106 may refer to entities for which the server system 200 is configured to perform the large-scale neural network training.
At 504, the server system 200 generates the plurality of data features corresponding to each entity based, at least in part, on the training data set.
At 506, the server system 200 scales the training data set based, at least in part, on the autoencoder 226. In particular, the server system 200 generates the input scalar value for the plurality of data features based, at least in part, on the training data set corresponding to each entity and the autoencoder 226. The server system 200 transmits the input scalar value as an input to the autoencoder 226. The autoencoder 226 is a specialized encoder that is configured to scale the plurality of data features. In one embodiment, the autoencoder 226 is fed with an identifier of the entity and data associated with the entity as the input scalar value. The server system 200 processes the input scalar value with the facilitation of the autoencoder 226 for scaling the plurality of data features for each entity of the plurality of entities 106a-106n.
At 508, the server system 200 initializes global neural network parameters of the global NN model 228. In one embodiment, the server system 200 performs initialization of the global neural network parameters with the facilitation of Xavier initialization or any other similar initialization.
At 510, the server system 200 instances local neural network parameters (e.g., weights and biases) of each entity-specific NN model of a plurality of entity-specific NN models 230 based, at least in part, on the global neural network parameters. In one embodiment, a number of the plurality of entity-specific NN models 230 depends upon a number of the plurality of entities 106a-106n. More specifically, the server system 200 is configured to create various copies of the global NN model 228. Each copy of the global NN model 228 is an entity-specific NN model that is configured to be trained based on the training data set associated with a specific entity.
At 512, the server system 200 trains the plurality of entity-specific NN models 230 based, at least in part, on the scaled data instances. In one embodiment, an entity-specific NN model is trained based on scaled data instances corresponding to the training data set of an entity.
At 514, the server system 200 accesses a validation data set associated with each entity from the one or more data sources 108. In one embodiment, the validation data set is also passed through the autoencoder 226 for generating the scaled data instances.
At 516, the server system 200 calculates validation loss values for the plurality of entity-specific NN models 230 based on a loss function.
At 518, the server system 200 determines an attention weight for each entity-specific NN model based on a parameter function of the entity-specific NN model.
In one embodiment, the server system 200 aggregates similar entity-specific NN models together based, at least in part, on calculated validation loss values for each of the plurality of entity-specific NN models 230. The similar entity-specific NN models are aggregated for identifying similar entities of the plurality of entities 106 showing similar data patterns. In one embodiment, the server system 200 is configured to identify a set of entities with a similar change in validation loss values of the entity-specific NN models 230 to aggregate learnings of the entity-specific NN models 230, thereby reducing computational complexity and space requirements.
In one embodiment, the server system 200 calculates a cosine similarity score between gradients of the plurality of entity-specific NN models 230 to identify similar entity-specific NN models. The set of entities with a similar change in the validation loss values is identified based on the cosine similarity measure. In general, cosine similarity is used to measure the similarity between two vectors by calculating the cosine of the angle between the two vectors. The information (e.g., names of similar entity-specific NN models) of the plurality of entity-specific NN models 230 is further saved in a memory. In the next iteration during the training phase, a single copy of the similar entity-specific NN models is created for aggregating the similar entity-specific NN models together.
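The gradient-similarity check may be sketched as follows; the gradient vectors and the 0.95 threshold are illustrative assumptions:

```python
import numpy as np

# Illustrative sketch: grouping entity models whose gradients point in
# similar directions, using cosine similarity.
def cosine(a, b):
    """Cosine of the angle between two gradient vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

grads = {  # made-up per-entity gradient vectors
    "entity_1": np.array([1.0, 2.0, 0.0]),
    "entity_2": np.array([2.0, 4.1, 0.1]),   # nearly parallel to entity_1
    "entity_3": np.array([-1.0, 0.5, 3.0]),
}

sim = cosine(grads["entity_1"], grads["entity_2"])
similar = sim > 0.95   # aggregate these two models in the next iteration
```

Entity pairs whose score exceeds the threshold would share a single model copy in the next training iteration, reducing computation and memory.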
At 520, the server system 200 calculates the consolidated loss value based, at least in part, on a combination of validation loss values calculated for the plurality of entity-specific NN models 230 and attention weights associated with the plurality of entity-specific NN models 230.
At 522, the server system 200 updates global neural network parameters of the global NN model 228 based, at least in part, on the consolidated loss value. The global NN model 228 is trained based on learnings of the plurality of entity-specific NN models 230 with the facilitation of transfer learning methods.
The sequence of steps of the flow chart 500 need not be necessarily executed in the same order as they are presented. Further, one or more operations may be grouped together and performed in the form of a single step, or one operation may have several sub-steps that may be performed in parallel or in a sequential manner.
FIG. 6 represents a flow chart 600 of a process flow of training an entity-specific NN model for a new entity, in accordance with an embodiment of the present disclosure. The sequence of operations of the flow chart 600 may not be necessarily executed in the same order as they are presented. Further, one or more operations may be grouped together and performed in the form of a single step, or one operation may have several sub-steps that may be performed in parallel or in a sequential manner. It is to be noted that to explain the flow chart 600, references may be made to elements described in FIG. 1A and FIG. 2.
As mentioned earlier, the server system 200 is configured to utilize learnings of other entities for training the entity-specific NN model for a new entity. Therefore, there is no need to train the entity-specific NN model from scratch. In one embodiment, the entity-specific NN model is created based on an instance of the global NN model 228.
At 602, the server system 200 may receive a request for training an entity-specific NN model for a new entity. In one embodiment, the server system 200 may receive the request from an administrator. In one example, the administrator may be any person, organization, or individual associated with the server system 200. The administrator may be responsible for upkeep and maintenance of the server system 200. The administrator may be responsible for troubleshooting the server system 200.
In one embodiment, the new entity is of the same type as the plurality of entities 106 for which the global NN model 228 has already been trained. In one example, the new entity may include a financial entity (e.g., acquirer, issuer, etc.), medical facility (e.g., hospitals, laboratories, etc.), non-banking financial company, educational institution, government agency, telecom industry, and the like.
At 604, the server system 200 accesses the training data set of the new entity from the one or more data sources 108.
At 606, the server system 200 scales the training data set based on the autoencoder 226. In one embodiment, the identifier of the new entity and data associated with the new entity is passed as an input to the autoencoder 226. The autoencoder 226 scales the training data for the new entity. More specifically, the autoencoder 226 reconstructs a plurality of data features for generating standardized and scaled data instances for the new entity.
At 608, the server system 200 initializes the entity-specific NN model based on global neural network parameters of the global NN model 228. The global NN model 228 is trained previously based on consolidated learnings of the plurality of entity-specific NN models 230.
At 610, the server system 200 trains the entity-specific NN model based on the scaled training data set.
At 612, the server system 200 fine-tunes the global neural network parameters (i.e., consolidated weight value) of the global NN model 228 based on a validation loss value for the entity-specific NN model of the new entity. In one embodiment, the server system 200 is configured to re-train the global NN model 228 based on learnings of the plurality of entity-specific NN models 230 after the addition of the entity-specific NN model of the new entity with facilitation of transfer learning methods.
FIG. 7 represents a flow chart 700 of a process flow of joint training of the plurality of financial entities 126a-126n, in accordance with an embodiment of the present disclosure. The sequence of operations of the flow chart 700 may not be necessarily executed in the same order as they are presented. Further, one or more operations may be grouped together and performed in the form of a single step, or one operation may have several sub-steps that may be performed in parallel or in a sequential manner. It is to be noted that to explain the flow chart 700, references may be made to elements described in FIG. 1B and FIG. 2.
At 702, the server system 200 accesses the training data set associated with each financial entity of the plurality of financial entities 126a-126n from the one or more financial data sources 128. In one example, the one or more financial data sources 128 may include data centers, repositories, data warehouses, a plurality of servers (e.g., application server, web server, media server, etc.), and the like.
In one embodiment, the plurality of financial entities 126a-126n may include financial institutions (e.g., issuer, acquirer, etc.), non-banking financial companies, payment companies, and the like. In one example, the plurality of financial entities 126a-126n may refer to entities for which the server system 200 is configured to perform the large-scale neural network training.
At 704, the server system 200 generates the plurality of data features corresponding to each financial entity based, at least in part, on the training data set.
At 706, the server system 200 scales the training data set based, at least in part, on the autoencoder 226. In particular, the server system 200 generates the input scalar value for the plurality of data features based, at least in part, on the training data set corresponding to each financial entity and the autoencoder 226. The server system 200 transmits the input scalar value as an input to the autoencoder 226. The autoencoder 226 is a specialized encoder that is configured to scale the plurality of data features. In one embodiment, the autoencoder 226 is fed with an identifier of the financial entity and data associated with the financial entity as the input scalar value. The server system 200 processes the input scalar value with the facilitation of the autoencoder 226 for scaling the plurality of data features for each financial entity of the plurality of financial entities 126a-126n.
At 708, the server system 200 initializes global neural network parameters of the global NN model 228. In one embodiment, the server system 200 performs initialization of the global neural network parameters with the facilitation of Xavier initialization or any other similar initialization.
At 710, the server system 200 instances local neural network parameters (e.g., weights and biases) of each entity-specific NN model of a plurality of entity-specific NN models 230 based, at least in part, on the global neural network parameters. In one embodiment, a number of the plurality of entity-specific NN models 230 depends upon a number of the plurality of financial entities 126a-126n. More specifically, the server system 200 is configured to create various copies of the global NN model 228. Each copy of the global NN model 228 is an entity-specific NN model that is configured to be trained based on the training data set associated with a specific financial entity.
At 712, the server system 200 trains the plurality of entity-specific NN models 230 based, at least in part, on the scaled data instances. In one embodiment, each entity-specific NN model is trained based on the scaled data instances corresponding to the training data set of each financial entity of the plurality of financial entities 126a-126n.
At 714, the server system 200 accesses a validation data set associated with each financial entity from the one or more financial data sources 128. In one embodiment, the validation data set is also passed through the autoencoder 226 for generating the scaled data instances.
At 716, the server system 200 calculates validation loss values for the plurality of entity-specific NN models 230 based on a loss function.
At 718, the server system 200 determines an attention weight for each entity-specific NN model based, at least in part, on a parameter function of the entity-specific NN model.
In one embodiment, the server system 200 aggregates similar entity-specific NN models together based, at least in part, on calculated validation loss values for each of the plurality of entity-specific NN models 230. The similar entity-specific NN models are aggregated for identifying similar financial entities of the plurality of financial entities 126a-126n showing similar data patterns.
At 720, the server system 200 calculates the consolidated loss value based, at least in part, on a combination of validation loss values calculated for the plurality of entity-specific NN models 230 and attention weights associated with the plurality of entity-specific NN models 230.
At 722, the server system 200 updates global neural network parameters of the global NN model 228 based, at least in part, on the consolidated loss value. The global NN model 228 is trained based on learnings of the plurality of entity-specific NN models 230 with the facilitation of transfer learning methods.
Thus, an entity-specific neural network model associated with a particular financial entity is configured to predict network scores of cardholders associated with the particular financial entity while generalizing learnings of the entity-specific neural network model from the plurality of financial entities.
The sequence of steps of the flow chart 700 need not be necessarily executed in the same order as they are presented. Further, one or more operations may be grouped together and performed in the form of a single step, or one operation may have several sub-steps that may be performed in parallel or in a sequential manner.
FIG. 8 illustrates a flow diagram depicting a method 800 for large-scale neural network training for multiple entities, in accordance with an embodiment of the present disclosure. The method 800 depicted in the flow diagram may be executed by, for example, the server system 200. Operations of the method 800, and combinations of operations in the method 800, may be implemented by, for example, hardware, firmware, a processor, circuitry, and/or a different device associated with the execution of software that includes one or more computer program instructions. The operations of the method 800 described herein may be performed by an application interface that is hosted and managed with the help of the server system 200. The method 800 starts at operation 802.
At operation 802, the method 800 includes accessing, by the server system 200, a training data set associated with each entity of the plurality of entities 106 from the one or more data sources 108.
At operation 804, the method 800 includes initializing, by the server system 200, global neural network parameters of the global NN model 228.
At operation 806, the method 800 includes instancing, by the server system 200, local neural network parameters of each entity-specific NN model of the plurality of entity-specific NN models 230 based, at least in part, on the global neural network parameters.
At operation 808, the method 800 includes training, by the server system 200, the plurality of entity-specific NN models 230 based, at least in part, on the corresponding training data set associated with each entity of the plurality of entities 106a-106n.
At operation 810, the method 800 includes accessing, by the server system 200, a validation data set associated with each entity from the one or more data sources 108.
At operation 812, the method 800 includes calculating, by the server system 200, the consolidated loss value based, at least in part, on the combination of validation loss values calculated for the plurality of entity-specific NN models 230 and attention weights associated with the plurality of entity-specific NN models 230.
At operation 814, the method 800 includes updating, by the server system 200, global neural network parameters of the global NN model 228 based, at least in part, on the consolidated loss value. The global NN model 228 is trained based on learnings of the plurality of entity-specific NN models 230 with the facilitation of transfer learning methods.
The sequence of operations of the method 800 need not be necessarily executed in the same order as they are presented. Further, one or more operations may be grouped together and performed in the form of a single step, or one operation may have several sub-steps that may be performed in parallel or in a sequential manner.
Without limiting the scope of the present disclosure, the one or more example embodiments disclosed herein provide methods and systems for training large-scale neural network models. The server system 200 trains the plurality of entity-specific neural network (NN) models for the plurality of entities. In addition, the server system instances the global neural network model based, at least in part, on the plurality of entity-specific neural network models. The server system reduces computational complexity during the addition of the new entity-specific NN model to the plurality of entity-specific NN models while implementing the global neural network model.
The disclosed methods with reference to FIGS. 1A to 8, or one or more operations of the methods 500, 600, 700, and 800, may be implemented using software including computer-executable instructions stored on one or more computer-readable media (e.g., non-transitory computer-readable media, such as one or more optical media discs, volatile memory components (e.g., DRAM or SRAM), or nonvolatile memory or storage components (e.g., hard drives or solid-state nonvolatile memory components, such as Flash memory components)) and executed on a computer (e.g., any suitable computer, such as a laptop computer, netbook, Webbook, tablet computing device, smartphone, or other mobile computing device). Such software may be executed, for example, on a single local computer or in a network environment (e.g., via the Internet, a wide-area network, a local-area network, a remote web-based server, a client-server network (such as a cloud computing network), or other such networks) using one or more network computers. Additionally, any of the intermediate or final data created and used during implementation of the disclosed methods or systems may also be stored on one or more computer-readable media (e.g., non-transitory computer-readable media) and are considered to be within the scope of the disclosed technology. Furthermore, any of the software-based embodiments may be uploaded, downloaded, or remotely accessed through a suitable communication means. Such suitable communication means include, for example, the Internet, the World Wide Web, an intranet, software applications, cable (including fiber optic cable), magnetic communications, electromagnetic communications (including RF, microwave, and infrared communications), electronic communications, or other such communication means.
Although the disclosure has been described with reference to specific exemplary embodiments, it is noted that various modifications and changes may be made to these embodiments without departing from the broad spirit and scope of the disclosure. For example, the various operations, blocks, etc. described herein may be enabled and operated using hardware circuitry (for example, complementary metal-oxide semiconductor (CMOS) based logic circuitry), firmware, software and/or any combination of hardware, firmware, and/or software (for example, embodied in a machine-readable medium). For example, the apparatuses and methods may be embodied using transistors, logic gates, and electrical circuits (for example, application-specific integrated circuit (ASIC) circuitry and/or in Digital Signal Processor (DSP) circuitry).
Particularly, the server system 200 (e.g., the server system 102) and its various components such as the computer system 202 and the database 204 may be enabled using software and/or using transistors, logic gates, and electrical circuits (for example, integrated circuit circuitry such as ASIC circuitry). Various embodiments of the disclosure may include one or more computer programs stored or otherwise embodied on a computer-readable medium, wherein the computer programs are configured to cause a processor or computer to perform one or more operations. A computer-readable medium storing, embodying, or encoded with a computer program, or similar language, may be embodied as a tangible data storage device storing one or more software programs that are configured to cause a processor or computer to perform one or more operations. Such operations may be, for example, any of the steps or operations described herein. In some embodiments, the computer programs may be stored and provided to a computer using any type of non-transitory computer-readable media. Non-transitory computer-readable media include any type of tangible storage media. Examples of non-transitory computer-readable media include magnetic storage media (such as floppy disks, magnetic tapes, hard disk drives, etc.), optical magnetic storage media (e.g., magneto-optical disks), CD-ROM (compact disc read-only memory), CD-R (compact disc recordable), CD-R/W (compact disc rewritable), DVD (Digital Versatile Disc), BD (BLU-RAY® Disc), and semiconductor memories (such as mask ROM, PROM (programmable ROM), EPROM (erasable PROM), flash memory, RAM (random access memory), etc.). Additionally, a tangible data storage device may be embodied as one or more volatile memory devices, one or more non-volatile memory devices, and/or a combination of one or more volatile memory devices and non-volatile memory devices. 
In some embodiments, the computer programs may be provided to a computer using any type of transitory computer-readable media. Examples of transitory computer-readable media include electric signals, optical signals, and electromagnetic waves. Transitory computer-readable media can provide the program to a computer via a wired communication line (e.g., electric wires and optical fibers) or a wireless communication line.
Various embodiments of the invention, as discussed above, may be practiced with steps and/or operations in a different order, and/or with hardware elements in configurations, which are different than those which are disclosed. Therefore, although the invention has been described based upon these exemplary embodiments, it is noted that certain modifications, variations, and alternative constructions may be apparent and well within the spirit and scope of the invention.
Although various exemplary embodiments of the invention are described herein in a language specific to structural features and/or methodological acts, the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as exemplary forms of implementing the claims.
| # | Name | Date |
|---|---|---|
| 1 | 202141055908-STATEMENT OF UNDERTAKING (FORM 3) [02-12-2021(online)].pdf | 2021-12-02 |
| 2 | 202141055908-POWER OF AUTHORITY [02-12-2021(online)].pdf | 2021-12-02 |
| 3 | 202141055908-FORM 1 [02-12-2021(online)].pdf | 2021-12-02 |
| 4 | 202141055908-FIGURE OF ABSTRACT [02-12-2021(online)].jpg | 2021-12-02 |
| 5 | 202141055908-DRAWINGS [02-12-2021(online)].pdf | 2021-12-02 |
| 6 | 202141055908-DECLARATION OF INVENTORSHIP (FORM 5) [02-12-2021(online)].pdf | 2021-12-02 |
| 7 | 202141055908-COMPLETE SPECIFICATION [02-12-2021(online)].pdf | 2021-12-02 |
| 8 | 202141055908-Proof of Right [15-01-2022(online)].pdf | 2022-01-15 |
| 9 | 202141055908-Correspondence_Assignment_07-03-2022.pdf | 2022-03-07 |
| 10 | 202141055908-Correspondence_General Power of Attorney_26-04-2022.pdf | 2022-04-26 |
| 11 | 202141055908-FORM 18 [20-11-2025(online)].pdf | 2025-11-20 |