Abstract: The present disclosure relates to a method for identifying one or more fraudulent intermediary agents among a plurality of intermediary agents of an institution. The plurality of intermediary agents provide credit services to a plurality of end users on behalf of the institution. The method comprises receiving a plurality of primary attributes related to each of the intermediary agents and storing the plurality of primary attributes in a database. Further, a plurality of secondary attributes indicative of conduct of each of the intermediary agents are determined based on exploratory data analysis of the plurality of primary attributes, and a score indicative of fraudulent or non-fraudulent behaviour of each of the intermediary agents is generated by processing the plurality of secondary attributes. Based on the generated score for each of the plurality of intermediary agents, one or more fraudulent intermediary agents among the plurality of intermediary agents are identified. [Fig. 3]
FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENTS RULES, 2003
COMPLETE SPECIFICATION (See section 10, rule 13)
“SYSTEM AND METHOD TO IDENTIFY FRAUDULENT INTERMEDIARY AGENTS OF AN INSTITUTION”
Name and address of the applicant:
a) Name: MASTEK LIMITED
b) Nationality: INDIAN
c) Address: Mastek Millennium Center, A-7, Millennium Business Park, Off Thane Belapur Road, Mahape, Navi Mumbai - 400 710, India
The following specification particularly describes the invention and the manner in which it is
to be performed.
[0001] PREAMBLE TO THE DESCRIPTION
[0002] The following specification particularly describes the invention and the manner in
which it is to be performed:
[0003] TECHNICAL FIELD
[0004] The present disclosure relates to banking systems. More particularly, the present disclosure describes a system and method for identification of fraudulent behaviour of intermediary agents of a banking institution.
[0005] BACKGROUND
[0006] The following description includes information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.
[0007] Typically, banking institutions such as microfinance institutions provide unsecured personal or consumer loans to customers in different markets. A key line of banking business is home credit cash loans, which is managed through a network of intermediary agents. The intermediary agents sell loans to customers and collect payments from the customers on a regular basis. However, some agents may commit fraud by opening fictitious loans or pocketing customer repayments. The agents who commit fraud count on the bank writing off these fraudulent activities alongside bad loans, which altogether represent a significant annual loss for such a bank.
[0008] Typically, a team of fraud detection associates is assigned to identify agent fraud by manually reviewing repayment collections data, which may be collated in Excel spreadsheets. This process is labour intensive, time-consuming and less accurate than desired to support the bank’s efforts to combat agent fraud. In addition, it is difficult to scale such a manual process in line with business expansion.
[0009] Thus, there exists a need in the art for a robust system and technique to identify fraudulent intermediary agents among a plurality of intermediary agents of the banking institution.
[0010] SUMMARY
[0011] The present disclosure overcomes one or more shortcomings of the prior art and provides additional advantages. Embodiments and aspects of the disclosure described in detail herein are considered a part of the claimed disclosure.
[0012] A system of one or more computers can be configured to perform particular
operations or actions by virtue of having software, firmware, hardware, or a combination of them installed on the system that in operation causes or cause the system to perform the actions. One or more computer programs can be configured to perform particular operations or actions by virtue of including instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions. One general aspect includes a system to identify one or more fraudulent intermediary agents among a plurality of intermediary agents of an institution. The system also includes a receiving unit configured to receive a plurality of primary attributes related to each of the intermediary agents. The system also includes a database operatively coupled to the receiving unit and configured to store the plurality of primary attributes; and a processing unit may include a pre-trained model operatively coupled to the receiving unit and the database, and configured to: determine a plurality of secondary attributes indicative of conduct of each of the intermediary agents based on exploratory data analysis of the plurality of primary attributes; generate a score indicative of fraudulent or non-fraudulent behaviour of each of the intermediary agents by processing the plurality of secondary attributes; and identify the one or more fraudulent intermediary agents among the plurality of intermediary agents based on the generated score for each of the plurality of intermediary agents. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.
[0013] Implementations may include one or more of the following features. The
plurality of primary attributes for each of the intermediary agents may include one or more of: demographics details, outstanding credit value, expected credit collection value during a first time period, actual credit collection value, a number of new credit accounts, a total credit value of the new credit accounts, and a value of available credit reserve. The exploratory data analysis of the plurality of primary attributes to determine the plurality of secondary attributes may include analysing one or more of credit collection patterns of the plurality of agents, current and historical credit collection per agent, and actual credit collection in comparison to expected
credit collection per agent. The plurality of secondary attributes may include: outstanding credit value, regular credit recovery deviation from institution mean, zero credit recovery deviation from institution mean, regular credit recovery percentage over a second time period, zero credit recovery percentage over the second time period, float over the second time period, total credit value of the new credit accounts in a region of the institution, mean of percentage of credit collected, institution wise mean of percentage of credit collected over a first time period, institution wise zero credit recovery percentage mean, institution wise sum of the new credit accounts, a ratio of the total credit value of the new credit accounts and a number of the new credit accounts. The pre-trained model is trained based on historical primary and secondary attributes, and where to train the pre-trained model, the processing unit is configured to: receive one or more datasets related to a plurality of intermediary agents from one or more databases, where the datasets may include a plurality of historical primary attributes associated with each of the plurality of intermediary agents, where the plurality of intermediary agents may include a set of fraudulent intermediary agents and a set of non-fraudulent intermediary agents; analyse the plurality of historical primary attributes to determine a plurality of historical secondary attributes associated with each of the set of fraudulent intermediary agents and the set of non-fraudulent intermediary agents; determine a threshold score indicative of a fraudulent behaviour of the intermediary agents based on the plurality of historical secondary attributes associated with each of the set of fraudulent intermediary agents and the set of non-fraudulent intermediary agents; and train the model, based on the determined threshold score and the plurality of historical secondary attributes associated with each of the set of fraudulent
intermediary agents and the set of non-fraudulent intermediary agents, to identify one or more fraudulent intermediary agents in real-time. The score is a value between 0 and 1, and a score greater than the threshold score indicates the fraudulent behaviour. The processing unit is configured to train the model for detecting false positives and false negatives. The processing unit is configured to determine a pattern between the one or more intermediary agents vis-a-vis one or more end users, based on past detection or past suspicious behaviour. The processing unit is configured to store the secondary attributes in the database. The method as claimed further may include storing the secondary attributes in the database. Implementations of the described techniques may include hardware, a method or process, or computer software on a computer-accessible medium.
[0014] One general aspect includes a method for identifying one or more fraudulent
intermediary agents among a plurality of intermediary agents of an institution. The method also
includes receiving a plurality of primary attributes related to each of the intermediary agents. The method also includes storing the plurality of primary attributes in a database; determining a plurality of secondary attributes indicative of conduct of each of the intermediary agents based on exploratory data analysis of the plurality of primary attributes. The method also includes generating a score indicative of fraudulent or non-fraudulent behaviour of each of the intermediary agents by processing the plurality of secondary attributes. The method also includes identifying the one or more fraudulent intermediary agents among the plurality of intermediary agents based on the generated score for each of the plurality of intermediary agents. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.
[0015] Implementations may include one or more of the following features. The
method as claimed where the plurality of primary attributes for each of the intermediary agents may include one or more of: demographics details, outstanding credit value, expected credit collection value during a first time period, actual credit collection value, a number of new credit accounts, a total credit value of the new credit accounts, and a value of available credit reserve. The exploratory data analysis of the plurality of primary attributes to determine the plurality of secondary attributes may include analysing one or more of credit collection patterns of the plurality of agents, current and historical credit collection per agent, and actual credit collection in comparison to expected credit collection per agent. The plurality of secondary attributes may include: outstanding credit value, regular credit recovery deviation from institution mean, zero credit recovery deviation from institution mean, regular credit recovery percentage over a second time period, zero credit recovery percentage over the second time period, float over the second time period, total credit value of the new credit accounts in a region of the institution, mean of percentage of credit collected, institution wise mean of percentage of credit collected over a first time period, institution wise zero credit recovery percentage mean, institution wise sum of the new credit accounts, a ratio of the total credit value of the new credit accounts and a number of the new credit accounts. The pre-trained model is trained based on historical primary and secondary attributes, and where training of the pre-trained model may include: receiving one or more datasets related to a plurality of intermediary agents from one or more databases, where the datasets may include a plurality of historical primary attributes associated with each of the plurality of intermediary agents, where the plurality of
intermediary agents may include a set of fraudulent intermediary agents and a set of non-fraudulent intermediary agents; analysing the plurality of historical primary attributes to determine a plurality of historical secondary attributes associated with each of the set of fraudulent intermediary agents and the set of non-fraudulent intermediary agents; determining a threshold score indicative of a fraudulent behaviour of the intermediary agents based on the plurality of historical secondary attributes associated with each of the set of fraudulent intermediary agents and the set of non-fraudulent intermediary agents; and training the model, based on the determined threshold score and the plurality of historical secondary attributes associated with each of the set of fraudulent intermediary agents and the set of non-fraudulent intermediary agents, to identify one or more fraudulent intermediary agents in real-time. The score is a value between 0 and 1; and a score greater than the threshold score indicates the fraudulent behaviour. The method as claimed further may include training the model for detecting false positives and false negatives. The method as claimed may include: determining a pattern between the one or more intermediary agents vis-a-vis one or more end users, based on past detection or past suspicious behaviour.
[0016] Implementations of the described techniques may include hardware, a method or process, or computer software on a computer-accessible medium.
[0017] The foregoing summary is illustrative only and is not intended to be in any way limiting. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features will become apparent by reference to the drawings and the following detailed description.
[0018] BRIEF DESCRIPTION OF DRAWINGS
[0019] The features, nature, and advantages of the present disclosure will become more apparent from the detailed description set forth below when taken in conjunction with the drawings in which like reference characters identify correspondingly throughout. Some embodiments of system and/or methods in accordance with embodiments of the present subject matter are now described, by way of example only, and with reference to the accompanying Figs., in which:
[0020] Fig. 1a illustrates a block diagram illustrating a system to identify one or more fraudulent intermediary agents, in accordance with an embodiment of the present disclosure.
[0021] Fig. 1b illustrates a block diagram illustrating training of a model for fraud detection, in accordance with an embodiment of the present disclosure.
[0022] Figs. 2a-2g illustrate boxplot comparisons of the behaviour of fraudulent and non-fraudulent agents, in accordance with an embodiment of the present disclosure.
[0023] Fig. 3 illustrates an exemplary flow of operation for identifying one or more fraudulent intermediary agents, in accordance with an embodiment of the present disclosure.
[0024] It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative systems embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in a computer readable medium and executed by a computer or processor, whether or not such computer or processor is explicitly shown.
[0025] OBJECT OF THE INVENTION:
[0026] The primary object of the present invention is to provide an automated system to enable fraud detection.
[0027] Another object of the present invention is to provide an automated fraud detection system which provides a common risk assessment approach.
[0028] A further object of the present invention is to provide an automated fraud detection system which helps to reduce bad loans.
[0029] DETAILED DESCRIPTION
[0030] The foregoing has broadly outlined the features and technical advantages of the present
disclosure in order that the detailed description of the disclosure that follows may be better
understood. It should be appreciated by those skilled in the art that the conception and specific
embodiment disclosed may be readily utilized as a basis for modifying or designing other
structures for carrying out the same purposes of the present disclosure.
[0031] The novel features which are believed to be characteristic of the disclosure, both as to
its organization and method of operation, together with further objects and advantages will be
better understood from the following description when considered in connection with the accompanying figures. It is to be expressly understood, however, that each of the figures is provided for the purpose of illustration and description only and is not intended as a definition of the limits of the present disclosure.
[0032] The present disclosure relates to the banking industry and, particularly, to the microfinance industry, which provides credit services such as loans. These loans are very small in amount and short in tenure. Generally, before a loan is issued, an institution performs a detailed analysis of who is borrowing, why they are borrowing, and whether they have borrowed before. Such analysis is not feasible in microfinance because the loan amounts are very small, and the whole microfinance industry works in this manner. In addition, many of the borrowers may not have any payment history because they belong to very low income groups in society. Because these low income groups have no payment history or other means of demonstrating their income, it is difficult for them to obtain loans from large banking institutions.
[0033] Thus, the problem is that the microfinance industry issues small loans to people who have no loan history, for basic reasons such as very low income. Such borrowers may not be able to physically visit a bank to obtain these small loans, so they have to go through a set of agents. These agents typically go door to door to the borrowers, i.e., the customers, to recover the amount due in a given time period. The time period may be a week or a month; generally, these payments happen on a weekly basis. An agent collects payments over an entire week and, at the end of the week or on a pre-decided date, goes to the bank branch with which the agent is associated to reconcile. It is possible that some borrowers have paid back their weekly amount and some borrowers want to renew their loans, in which case the agent disburses that money and handles the cash accordingly. At the end of the week or other time period, the agent provides the branch with records of all collections and disbursements and reconciles the records. This is the first digital trace of the transactions that have happened. The bank maintains per-agent, per-week records. The records may be in Excel format, where every row corresponds to one agent for one week.
[0034] According to the present disclosure, multiple types of fraud can occur, i.e., the borrower may simply abandon the loan, or the agent may create a fictitious loan (there is no borrower on the other side, and the agent obtains the money from the bank). In order to mask the fraud, the agent may make the first few repayments out of the total instalments, for example the first two or three, and thereafter stop repaying. The present disclosure focuses more particularly on this kind of fraud, because by volume it is very high in microfinance companies, but is not limited thereto.
[0035] Conventionally, in order to detect fraud, the records of every agent are manually reviewed by an analysis team to detect a change in the pattern of credit recovery that the agent deposits at the bank. A change in the pattern indicates the possibility that the agent has committed fraud. Upon detection, a further enquiry may be conducted against the agent. However, this process is tied to geography because the analysis teams present in various geographies are not connected; thus, it is a region-specific process. Further, it is extremely cumbersome for a human to go through large data sets and work out from the data alone what could be a fraud. Furthermore, the analysis team may produce predictions with low accuracy, and it is quite possible that the analysis team becomes biased towards a certain set of agents over the others.
[0036] The above-mentioned problems are overcome by the fraud detection system, which automatically detects whether a certain agent is fraudulent based on determined patterns. Fig. 1a illustrates a block diagram of the fraud detection system according to an embodiment of the present disclosure. The system 100 may comprise a receiving unit 102, a database 104, and a processing unit 106, but is not limited thereto. These units 102-106 may be communicatively and operatively coupled to each other and can exchange information with each other via a wired or wireless channel. The processing unit 106 comprises a model 108 which is pre-trained to detect the fraudulent agent.
[0037] The system 100 may identify one or more fraudulent intermediary agents among a plurality of intermediary agents of an institution. The plurality of intermediary agents provide credit services to a plurality of end users on behalf of the institution. According to an embodiment, the receiving unit 102 may receive a dataset related to a plurality of agents, which comprises a plurality of primary attributes related to each of the intermediary agents. The received primary attributes may be stored in the database 104. In an embodiment, the plurality of primary attributes for each of the intermediary agents comprises one or more of: demographics details, outstanding credit value, expected credit collection value during a first time period (which may be a pre-decided time period such as a week or a month, but not limited thereto), actual credit collection value, a number of new credit accounts, a total credit value of the new credit accounts, and a value of available credit reserve, but not limited thereto.
[0038] The system considers the demographics details of the agents such as name, age, family size, and occupation before they entered this business or any occupation that they have alongside, but not limited thereto. The system also considers the pattern of the agent on a periodic basis, such as month on month or week on week, but not limited thereto. For example, suppose that the agent typically repays an amount X every week. If, in a particular week, the agent pays 50% more than X, or just 25% of X, that is a shift in the agent's repayment pattern, which may indicate that the agent is committing a fraud and that a deeper enquiry with this particular agent is required.
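The week-on-week example above can be illustrated with a minimal sketch. It is not the disclosed implementation: the function name `repayment_shift`, the use of a simple mean as the usual amount X, and the cut-off multiples `upper` and `lower` are assumptions chosen only to mirror the "50% more than X" and "25% of X" figures.

```python
from statistics import mean

def repayment_shift(history, current, upper=1.5, lower=0.25):
    """Return True when the current deposit departs from the agent's usual amount X.

    history: past weekly deposit amounts for one agent (hypothetical input).
    upper, lower: illustrative multiples of X, taken from the example in the text.
    """
    if not history:
        return False  # no baseline yet, so no shift can be judged
    usual = mean(history)  # X, the typical weekly deposit
    if usual == 0:
        return current > 0
    ratio = current / usual
    return ratio >= upper or ratio <= lower

weekly = [1000, 980, 1020, 1000]      # usual amount X is 1000
print(repayment_shift(weekly, 1500))  # 50% more than X
print(repayment_shift(weekly, 250))   # 25% of X
print(repayment_shift(weekly, 990))   # close to X
```

A flagged week is only a prompt for deeper enquiry, as the paragraph states, not a finding of fraud.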
[0039] Referring again to fig. 1a, after receiving the primary attributes, the system 100 may analyse various additional factors/secondary attributes such as outstanding balance on agents’ own bank loans, repayment pattern of all agents in the region with bank (institution) loans, current and historical loan recovery value per agent, and actual amount recovered compared with anticipated amount per agent, etc., but not limited thereto. Particularly, after receiving the primary attributes, the processing unit 106 determines a plurality of secondary attributes. The secondary attributes are indicative of conduct of each of the intermediary agents based on exploratory data analysis of the plurality of primary attributes. The plurality of secondary attributes particularly may comprise: outstanding credit value, regular credit recovery deviation from institution mean, zero credit recovery deviation from institution mean, regular credit recovery percentage over a second time period (greater than the first time period), zero credit recovery percentage over the second time period, float over the second time period, total credit value of the new credit accounts in the region of the institution, mean of percentage of credit collected, institution wise mean of percentage of credit collected over a first time period, institution wise zero credit recovery percentage mean, institution wise sum of the new credit accounts, a ratio of the total credit value of the new credit accounts and a number of the new credit accounts, but not limited thereto. However, before determining the secondary attributes, the processing unit 106 may perform data pre-processing.
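As an illustration of how a few of the listed secondary attributes might be derived from per-agent, per-week primary attributes, consider the following sketch. The record layout and field names (`expected`, `actual`, `new_accounts`, `new_value`) are hypothetical, and only a subset of the attributes is computed; the disclosure does not prescribe these formulas.

```python
from statistics import mean

# Hypothetical per-agent, per-week records; field names are illustrative only.
records = [
    {"agent": "A1", "expected": 1000.0, "actual": 900.0, "new_accounts": 2, "new_value": 400.0},
    {"agent": "A1", "expected": 1000.0, "actual": 0.0, "new_accounts": 0, "new_value": 0.0},
    {"agent": "A2", "expected": 800.0, "actual": 400.0, "new_accounts": 1, "new_value": 600.0},
    {"agent": "A2", "expected": 800.0, "actual": 400.0, "new_accounts": 1, "new_value": 600.0},
]

def secondary_attributes(rows):
    """Derive a few secondary attributes per agent from weekly primary attributes."""
    by_agent = {}
    for row in rows:
        by_agent.setdefault(row["agent"], []).append(row)

    per_agent = {}
    for agent, weeks in by_agent.items():
        # Fraction of the expected amount actually collected, week by week.
        pct = [w["actual"] / w["expected"] for w in weeks if w["expected"] > 0]
        n_new = sum(w["new_accounts"] for w in weeks)
        v_new = sum(w["new_value"] for w in weeks)
        per_agent[agent] = {
            "mean_pct_collected": mean(pct) if pct else 0.0,
            # Share of weeks in which nothing was recovered.
            "zero_recovery_pct": sum(1 for w in weeks if w["actual"] == 0) / len(weeks),
            # Ratio of total new credit value to number of new credit accounts.
            "value_per_new_account": v_new / n_new if n_new else 0.0,
        }

    # Institution-wise mean, then each agent's deviation from it.
    institution_mean = mean(a["mean_pct_collected"] for a in per_agent.values())
    for attrs in per_agent.values():
        attrs["deviation_from_institution_mean"] = attrs["mean_pct_collected"] - institution_mean
    return per_agent

attrs = secondary_attributes(records)
print(attrs["A1"])
```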
[0040] In an embodiment, the system 100 may consider a shift from the average credit recovery for a particular branch to identify the fraudulent behaviour of the agent. One branch might have a number of agents associated with it, such as 100 or 150 agents, but not limited thereto. All these agents may be working in nearby geographies, regions, or villages, where the standard of living is typically the same. If, out of all agents associated with that branch, only a few agents show a very different payment in a certain week compared to the rest of the agents, it is quite likely that something irregular is going on with these few agents. In this manner, the system 100 may detect an anomaly in the dataset which is indicative of the fraud.
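One simple way to picture this branch-level comparison is sketched below. The rule used here, flagging agents whose weekly deposit lies more than `k` population standard deviations from the branch mean, is an assumption made for illustration; the disclosure states only that a shift from the branch average is considered. The agent identifiers and amounts are invented.

```python
from statistics import mean, pstdev

def branch_outliers(deposits, k=2.0):
    """Return agents whose weekly deposit is far from the branch mean.

    deposits: {agent_id: amount deposited this week} for a single branch.
    k: illustrative cut-off in population standard deviations.
    """
    values = list(deposits.values())
    if len(values) < 2:
        return []
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # every agent deposited the same amount
    return sorted(a for a, v in deposits.items() if abs(v - mu) > k * sigma)

branch = {
    "A1": 1000, "A2": 1010, "A3": 990, "A4": 1005, "A5": 995,
    "A6": 1000, "A7": 1002, "A8": 998, "A9": 1001, "A10": 300,
}
flagged = branch_outliers(branch)
print(flagged)
```

With very few agents in a branch a standard-deviation rule becomes unreliable, so a real deployment would likely need a more robust statistic.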
[0041] Furthermore, it is possible that the agents themselves become borrowers. If they have borrowed a large amount of money on their own account, then the chances that they will commit a fraud are very high. Thus, the system 100 may consider the outstanding credit value as an important parameter. Fig. 2a shows the outstanding credit value on the agents' own accounts. The institution allows agents to take loans on their own account if they need to. When the outstanding balance on an agent’s own account is high, the agent may commit fraud to repay that loan. Fig. 2a shows a boxplot comparison of the fraud and non-fraud cases, which demonstrates that fraud cases have a much higher median for the variable ‘Outstanding Credit Value On Agents Own Account’. As shown in fig. 2a, in the non-fraud cases the outstanding credit value on the agent’s own account is very low on average, and in the fraud cases it is very high. This indicates that the outstanding balance on an agent's own account is important for predicting whether a certain agent is committing fraud.
[0042] Further, when an agent commits fraud by not depositing the collected credit amount with the institution, that agent may be identified by comparing the credit recovery of that agent with that of other agents. As shown in figs. 2b-2c, in fraud cases the regular credit recovery deviates more from the institution branch mean and from the agent's own credit recovery over a time period. Fig. 2b illustrates a boxplot comparison of the regular credit collection deviation from the institution mean for fraud and non-fraud cases. Generally, all the agents associated with a certain branch of the institution deposit similar amounts of collected credit in a given time period, i.e., a week or a month. A smaller deviation means that an agent is paying back almost the same as the rest of the agents in the institution, whereas fig. 2b shows that the fraud cases deviate more from the institution branch mean, which indicates that deposited amounts vary more in fraud cases than in non-fraud cases. Further, Fig. 2c illustrates a boxplot comparison of the regular credit collection deviation over a time period for fraud and non-fraud cases, which shows that the deviation in regular payments is more in the fraud cases.
[0043] Furthermore, figs. 2d-2e illustrate boxplot comparisons of zero credit recovery deviation from the institution mean and zero credit recovery deviation over a time period (which may be 8 weeks or more, but not limited thereto), respectively, both of which are lower in case of frauds. This means that the agents who commit fraud have fewer zero credit recovery days than those who do not, because the fraudulent agents create an appearance that they are not committing fraud and tend to mask even those cases where there is zero recovery.
[0044] Further, for disbursement of loans, the agents may keep a certain amount of credit in hand, which is also called float. In the fraud cases, it is observed that such float averaged over a time period (e.g., 8 weeks, but not limited thereto) is higher. This is because the agents keep more credit in hand when they intend to commit a fraud, as shown in fig. 2f. Fig. 2f illustrates a boxplot comparison of the float for fraud and non-fraud cases. In an embodiment, the ratio of the total credit value of the new credit accounts to the number of the new credit accounts is higher in the fraud cases than in the non-fraud cases, as shown in fig. 2g. Fig. 2g illustrates a boxplot comparison of this ratio for fraud and non-fraud cases.
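The two quantities described in this paragraph can be sketched as follows, assuming the float is recorded once per week for each agent. The helper names `trailing_mean` and `value_per_new_account`, and the sample figures, are hypothetical.

```python
def trailing_mean(values, window=8):
    """Mean of the most recent `window` values (fewer if the history is shorter)."""
    recent = values[-window:]
    return sum(recent) / len(recent) if recent else 0.0

def value_per_new_account(total_value, count):
    """Ratio of the total credit value of new accounts to the number of new accounts."""
    return total_value / count if count else 0.0

weekly_float = [50, 60, 55, 65, 70, 80, 90, 100, 110, 120]  # one value per week
avg_float = trailing_mean(weekly_float)      # averages the last 8 weeks only
ratio = value_per_new_account(1200.0, 3)     # 3 new accounts worth 1200 in total
print(avg_float, ratio)  # 86.25 400.0
```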
[0045] Referring back to fig. 1a, upon determination of the plurality of secondary attributes, the processing unit 106 may generate a score indicative of fraudulent or non-fraudulent behaviour of each of the intermediary agents by processing the plurality of secondary attributes. Particularly, the processing unit 106 uses the pre-trained model to generate the score for an agent based on the secondary attributes. The score indicates whether any agent is fraudulent or not. After generating the score, the processing unit 106 may identify the one or more fraudulent intermediary agents among the plurality of intermediary agents based on the generated score for each of the plurality of intermediary agents. In order to identify the one or more fraudulent agents based on the score, the processing unit 106 may compare the generated score with a threshold score. According to an embodiment, the score is a value between 0 and 1, but not limited thereto. Further, a score greater than the threshold score indicates fraudulent behaviour. In an example, the threshold score may be 0.50, but not limited thereto. In an exemplary embodiment, each of the secondary attributes may be assigned a weight based on the importance/criticality of the attribute for detection of the fraudulent agent and the score may be generated by calculating an aggregated value of all the secondary attributes based on their respective weights.
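The weighted aggregation in the exemplary embodiment can be pictured with the sketch below. It assumes, purely for illustration, that each secondary attribute has already been scaled to the range 0 to 1 so that a weighted average also falls in that range; the attribute names and weights are invented and are not taken from the disclosure.

```python
def fraud_score(attributes, weights):
    """Weighted average of secondary attributes already scaled to the range 0..1."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * attributes[name] for name in weights) / total_weight

def is_fraudulent(score, threshold=0.50):
    """A score greater than the threshold indicates fraudulent behaviour."""
    return score > threshold

# Hypothetical weights reflecting the assumed importance of each attribute.
weights = {"outstanding_credit": 0.4, "deviation": 0.3, "float": 0.2, "value_per_new_account": 0.1}
agent = {"outstanding_credit": 0.9, "deviation": 0.8, "float": 0.5, "value_per_new_account": 0.2}

score = fraud_score(agent, weights)
print(round(score, 2), is_fraudulent(score))  # 0.72 True
```

In the disclosed system the score is produced by the pre-trained model 108; a fixed weighted average is only the simplest instance of "an aggregated value based on respective weights".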
[0046] Further, the processing unit 106 may determine a pattern between the one or more intermediary agents vis-a-vis one or more end users, based on past detection or past suspicious behaviour. It is possible that an end user and an agent are committing fraud together as a pair. Thus, such pattern determination helps to prevent fraud by the agent and/or the user.
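A minimal sketch of one possible pair-pattern check follows. It assumes that past detections or suspicious events are available as (agent, end user) pairs and simply counts recurring pairs; the function name `recurring_pairs`, the `min_count` cut-off, and the identifiers are hypothetical.

```python
from collections import Counter

def recurring_pairs(flagged_cases, min_count=2):
    """Return (agent, end user) pairs that recur in past flagged cases.

    flagged_cases: iterable of (agent_id, end_user_id) tuples from past
    detections or suspicious behaviour (hypothetical input format).
    """
    counts = Counter(flagged_cases)
    return {pair: n for pair, n in counts.items() if n >= min_count}

cases = [("A1", "U7"), ("A1", "U7"), ("A2", "U3"), ("A1", "U9"), ("A1", "U7")]
pairs = recurring_pairs(cases)
print(pairs)  # {('A1', 'U7'): 3}
```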
[0047] In this manner, the system is able to identify the fraudulent agents. The automated agent fraud detection system disclosed here supports the institution’s goals of reducing write-offs, and therefore operating costs helping the company to maintain profit margins and continue delivering shareholder value. These reductions help bank to: continue providing customers who are underbanked or underserved by mainstream credit operators with loans and lines of credit that are delivered responsibly and transparently, and maintain its positioning against payday, digital, bank operators and other home credit providers competing to serve credit to the same customer segment.
[0048] Further, the traditional approach of hand-coding rules for fraud detection does not allow a fraud detection engine to learn from new data. In contrast, the disclosed automated system significantly improves sustainability. Additionally, the automated fraud detection system works without any human bias, while augmenting the capabilities of the bank fraud detection associates using it. The proportion of fraud per country and the influencing factors were previously unknown. Implementing the automated agent fraud detection system provides the bank with greater insight into both areas.
[0049] Again referring to Fig. 1a, the model 108 may be pre-trained based on historical primary and secondary attributes. To train the model 108, the processing unit 106 may receive one or more datasets related to a plurality of intermediary agents from one or more databases, where the datasets comprise a plurality of historical primary attributes associated with each of the plurality of intermediary agents. The plurality of intermediary agents comprises a set of fraudulent intermediary agents and a set of non-fraudulent intermediary agents. The processing unit 106 may analyse the plurality of historical primary attributes to determine a plurality of historical secondary attributes associated with each of the set of fraudulent intermediary agents and the set of non-fraudulent intermediary agents. Further, the processing unit 106 may determine a threshold score indicative of a fraudulent behaviour of the intermediary agents based on the plurality of historical secondary attributes associated with each of the set of fraudulent intermediary agents and the set of non-fraudulent intermediary agents, and train the model, based on the determined threshold score and the plurality of historical secondary attributes associated with each of the set of fraudulent intermediary agents and the set of non-fraudulent intermediary agents, to identify one or more fraudulent intermediary agents in real-time. In order to test the trained model, a test dataset is used. The model is tested using 80% of the test data rather than the entire data, and the remaining 20% is used for validation. Further, the model is also trained for detecting false positives and false negatives in real-time, which further improves the accuracy.
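As a non-limiting illustration of the 80%/20% partition and of counting false positives and false negatives, a minimal sketch is given below. The function names, the fixed random seed, and the use of a shuffled split are assumptions made for the example; the disclosure does not state how the partition is drawn.

```python
import random
from typing import Dict, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_80_20(records: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle the records reproducibly and return an 80% part and a 20% part."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def confusion_counts(actual: Sequence[bool], predicted: Sequence[bool]) -> Dict[str, int]:
    """Count true/false positives and negatives, where True means 'fraudulent'."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for is_fraud, flagged in zip(actual, predicted):
        if flagged and is_fraud:
            counts["tp"] += 1
        elif flagged and not is_fraud:
            counts["fp"] += 1  # genuine agent wrongly flagged
        elif not flagged and is_fraud:
            counts["fn"] += 1  # fraudulent agent missed
        else:
            counts["tn"] += 1
    return counts
```

The false positive and false negative counts obtained on the held-out part may then be reviewed when deciding whether further training is needed.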
[0050] The training process may be easily understood from Fig. 1b, which shows the solution approach of fraud detection. As shown, the training dataset 110, which comprises patterns, i.e., fraudulent and/or non-fraudulent behaviour patterns, may be provided to and analysed by the AI/ML algorithms 112. In an embodiment, the training dataset 110 may comprise customer (agent) demography details, transaction details, etc., along with a fraud/non-fraud tag for each transaction/agent to train the model, but not limited thereto. The AI/ML algorithms 112 may find the patterns which distinguish fraudulent behaviour from non-fraudulent behaviour, and the model 108 may be trained to recognise the patterns. In an embodiment, the model 108 may be developed using a cloud-based GUI such as Azure Machine Learning studio, but not limited thereto. As part of the training process, the model 108 may be tested using the test dataset 114. The output 116 of the model 108 is reviewed to determine the accuracy, and the generated output may be used as feedback to improve the accuracy and for further training of the model 108 so that the model 108 may provide a list of fraudulent agents in real-time. The output 116 of the model 108 may be in the form of a list of potential fraudulent agents which may be further investigated.
[0051] In this manner, the model 108 may be used in the fraud identification process and to provide a common risk assessment approach. Further, the model 108 is able to reduce bad loans and reduce the loan interest rate to benefit genuine borrowers.
[0052] Fig. 3 illustrates a flow chart of a method 300 for identifying one or more fraudulent intermediary agents among a plurality of intermediary agents of an institution, according to an embodiment of the present disclosure. The plurality of intermediary agents provide credit services to a plurality of end users on behalf of the institution. The method 300 may also be described in the general context of computer executable instructions. Generally, computer executable instructions may include routines, programs, objects, components, data structures, procedures, models, and functions, which perform specific functions or implement specific abstract data types.
[0053] The order in which the method 300 is described is not intended to be construed as a limitation, and any number of the described method blocks may be combined in any order to implement the method. Additionally, individual blocks may be deleted from the methods without departing from the spirit and scope of the subject matter described.
[0054] The method 300, at step 302, may comprise receiving a plurality of primary attributes related to each of the intermediary agents. In an embodiment, the plurality of primary attributes for each of the intermediary agents comprises one or more of: demographics details, outstanding credit value, expected credit collection value during a first time period, actual credit collection value, a number of new credit accounts, a total credit value of the new credit accounts, and a value of available credit reserve, but not limited thereto. The method considers the demographics details of the agents such as name, age, family size, and occupation before they got into this business or any occupation that they have alongside, but not limited thereto. Further, the system also considers the repayment pattern of each agent on a month-on-month or week-on-week basis. For example, suppose the agent typically repays X pounds every week. If, in a particular week, the agent pays 50% more than X, or just 25% of X, that is a shift in the agent's repayment pattern, which can indicate that the agent may be committing fraud and that a deeper enquiry into this particular agent is needed.
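The week-on-week example above may be sketched as follows, purely for illustration. Taking the typical repayment X as the median of the earlier weeks is an assumption, as are the function name and the default ratios, which simply mirror the "50% more than X" and "25% of X" figures in the example.

```python
from statistics import median
from typing import Sequence


def repayment_shift(history: Sequence[float], current: float,
                    high_ratio: float = 1.5, low_ratio: float = 0.25) -> bool:
    """Return True when the current repayment deviates sharply from the typical one.

    The typical weekly repayment X is assumed to be the median of past weeks.
    """
    if not history:
        raise ValueError("history must contain at least one past repayment")
    typical = median(history)
    if typical <= 0:
        raise ValueError("typical repayment must be positive")
    ratio = current / typical
    return ratio >= high_ratio or ratio <= low_ratio
```

A shift detected in this way would only prompt a deeper enquiry into the particular agent; it does not by itself establish fraud.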
[0055] At step 304, the method discloses storing the plurality of primary attributes in a database. The method 300, at step 306, may further disclose determining a plurality of secondary attributes indicative of conduct of each of the intermediary agents based on exploratory data analysis of the plurality of primary attributes. Particularly, after receiving the primary attributes, various additional factors/secondary attributes such as outstanding balance on agents' own bank loans, repayment pattern of all agents in the region with bank loans, current and historical loan recovery value per agent, and actual amount recovered compared with anticipated amount per agent are analysed. Based on this analysis, the plurality of secondary attributes are determined. The plurality of secondary attributes particularly comprises: outstanding credit value, regular credit recovery deviation from institution mean, zero credit recovery deviation from institution mean, regular credit recovery percentage over a second time period, zero credit recovery percentage over the second time period, float over the second time period, total credit value of the new credit accounts in a region of the institution, mean of percentage of credit collected, institution wise mean of percentage of credit collected over a first time period, institution wise zero credit recovery percentage mean, institution wise sum of the new credit accounts, and a ratio of the total credit value of the new credit accounts and a number of the new credit accounts, but not limited thereto. However, before determining the secondary attributes, data pre-processing is performed on the primary attributes.
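For illustration, a few of the listed secondary attributes may be derived from primary attributes as sketched below. The exact formulas are not given in the disclosure; the field names and the definitions used here (percentage collected as actual over expected collection, deviation as the difference from the institution mean, and the value-per-new-account ratio) are assumptions.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class PrimaryAttributes:
    """Hypothetical subset of the primary attributes for one agent."""
    agent_id: str
    expected_collection: float
    actual_collection: float
    new_accounts: int
    new_accounts_value: float


def derive_secondary(agents: List[PrimaryAttributes]) -> Dict[str, Dict[str, float]]:
    """Derive illustrative secondary attributes for every agent."""
    if not agents:
        raise ValueError("at least one agent is required")
    # Percentage of expected credit actually collected, per agent.
    pct = {
        a.agent_id: (100.0 * a.actual_collection / a.expected_collection
                     if a.expected_collection else 0.0)
        for a in agents
    }
    institution_mean = mean(pct.values())
    result: Dict[str, Dict[str, float]] = {}
    for a in agents:
        result[a.agent_id] = {
            "pct_collected": pct[a.agent_id],
            "deviation_from_institution_mean": pct[a.agent_id] - institution_mean,
            # Ratio of total value of new accounts to number of new accounts.
            "value_per_new_account": (a.new_accounts_value / a.new_accounts
                                      if a.new_accounts else 0.0),
        }
    return result
```

Guarding the divisions against zero denominators is one simple form of the data pre-processing mentioned above.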
[0056] At step 308, the method 300 discloses generating a score indicative of fraudulent or non-fraudulent behaviour of each of the intermediary agents by processing the plurality of secondary attributes. Moving on, the method 300, at step 310, discloses identifying the one or more fraudulent intermediary agents among the plurality of intermediary agents based on the generated score for each of the plurality of intermediary agents. Particularly, the pre-trained model may be used to generate the score for an agent based on the secondary attributes. The score indicates whether the agent is fraudulent or not. In order to identify the one or more fraudulent agents based on the score, the generated score is compared with a threshold score. According to an embodiment, the score is a value between 0 and 1, but not limited thereto. Further, a score greater than the threshold score indicates fraudulent behaviour. In an example, the threshold score may be 0.50, but not limited thereto.
[0057] In an embodiment, the pre-trained model is trained based on historical primary and secondary attributes. The training of the pre-trained model may comprise receiving one or more datasets related to a plurality of intermediary agents from one or more databases. The datasets may comprise a plurality of historical primary attributes associated with each of the plurality of intermediary agents. The plurality of intermediary agents may comprise a set of fraudulent intermediary agents and a set of non-fraudulent intermediary agents. The training of the pre-trained model may further comprise analysing the plurality of historical primary attributes to determine a plurality of historical secondary attributes associated with each of the set of fraudulent intermediary agents and the set of non-fraudulent intermediary agents, and determining a threshold score indicative of a fraudulent behaviour of the intermediary agents based on the plurality of historical secondary attributes associated with each of the set of fraudulent intermediary agents and the set of non-fraudulent intermediary agents. Lastly, the model may be trained based on the determined threshold score and the plurality of historical secondary attributes associated with each of the set of fraudulent intermediary agents and the set of non-fraudulent intermediary agents, to identify one or more fraudulent intermediary agents in real-time. Furthermore, the model may also be trained for detecting false positives and false negatives in real-time, which further improves the accuracy.
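The disclosure does not state how the threshold score is determined from the historical data. One hedged possibility, shown below as a sketch only, is to pick the candidate threshold that misclassifies the fewest historically labelled agents. The function name and the selection rule are assumptions.

```python
from typing import Sequence, Tuple


def choose_threshold(scores: Sequence[float], is_fraud: Sequence[bool]) -> Tuple[float, int]:
    """Return (threshold, errors) minimising misclassifications on labelled history.

    An agent is treated as flagged when its score is strictly greater than the threshold.
    """
    if not scores or len(scores) != len(is_fraud):
        raise ValueError("scores and labels must be non-empty and of equal length")
    candidates = sorted(set(scores) | {0.0})
    best_threshold, best_errors = candidates[0], len(scores) + 1
    for threshold in candidates:
        # Each mismatch is either a false positive or a false negative.
        errors = sum((score > threshold) != label
                     for score, label in zip(scores, is_fraud))
        if errors < best_errors:  # ties keep the lowest candidate
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors
```

In practice, an institution might weigh false negatives more heavily than false positives; the equal weighting used here is chosen only to keep the example short.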
[0058] Further, the method may comprise a step of determining a pattern between the one or more intermediary agents vis-a-vis one or more end users, based on past detection or past suspicious behaviour. It is possible that an end user and an agent are committing fraud together as a pair. Thus, such pattern determination is helpful to prevent fraud by agent/user.
[0059] In this manner, the method is able to identify the fraudulent agents. The automated agent fraud detection technique disclosed here supports the institution’s goals of reducing write-offs, and therefore operating costs, helping the company to maintain profit margins and continue delivering shareholder value. These reductions help the bank to: continue providing customers who are underbanked or underserved by mainstream credit operators with loans and lines of credit that are delivered responsibly and transparently; and maintain its positioning against payday, digital, bank operators and other home credit providers competing to serve credit to the same customer segment.
[0060] Further, the traditional approach of hand-coding rules for fraud detection does not allow a fraud detection engine to learn from new data. In contrast, the disclosed automated techniques significantly improve sustainability. Additionally, the automated fraud detection techniques work without any human bias, while augmenting the capabilities of the bank fraud detection associates using them. The proportion of fraud per country and the influencing factors were previously unknown. Implementing the automated agent fraud detection technique provides the bank with greater insight into both areas.
[0061] The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed.
[0062] Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments.
[0063] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., to be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, non-volatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
[0064] Suitable processors include, by way of example, a general-purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a graphic processing unit (GPU), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) circuits, any other type of integrated circuit (IC), and/or a state machine.
[0066] WE CLAIM:
1. A system to identify one or more fraudulent intermediary agents among a plurality of intermediary agents of an institution, the plurality of intermediary agents provide credit services to a plurality of end users on behalf of the institution, the system comprises:
a receiving unit configured to receive a plurality of primary attributes related to each of the intermediary agents;
a database operatively coupled to the receiving unit and configured to store the plurality of primary attributes; and
a processing unit comprising a pre-trained model operatively coupled to the receiving unit and the database, and configured to:
determine a plurality of secondary attributes indicative of conduct of each of the intermediary agents based on exploratory data analysis of the plurality of primary attributes;
generate a score indicative of fraudulent or non-fraudulent behaviour of each of the intermediary agents by processing the plurality of secondary attributes; and
identify the one or more fraudulent intermediary agents among the plurality of intermediary agents based on the generated score for each of the plurality of intermediary agents.
2. The system as claimed in claim 1, wherein the plurality of primary attributes for each
of the intermediary agents comprises one or more of:
demographics details,
outstanding credit value,
expected credit collection value during a first time period,
actual credit collection value,
a number of new credit accounts,
a total credit value of the new credit accounts, and
a value of available credit reserve.
3. The system as claimed in claims 1-2, wherein the exploratory data analysis of the plurality of primary attributes to determine the plurality of secondary attributes comprises analyzing one or more of credit collection patterns of the plurality of agents, current and historical credit collection per agent, and actual credit collection in comparison to expected credit collection per agent.
4. The system as claimed in claim 1, wherein the plurality of secondary attributes comprises:
outstanding credit value,
regular credit recovery deviation from institution mean,
zero credit recovery deviation from institution mean,
regular credit recovery percentage over a second time period,
zero credit recovery percentage over the second time period,
float over the second time period,
total credit value of the new credit accounts in a region of the institution,
mean of percentage of credit collected,
institution wise mean of percentage of credit collected over a first time period,
institution wise zero credit recovery percentage mean,
institution wise sum of the new credit accounts,
a ratio of the total credit value of the new credit accounts and a number of the new
credit accounts.
5. The system as claimed in claim 1, wherein the pre-trained model is trained based on
historical primary and secondary attributes, and
wherein to train the pre-trained model, the processing unit is configured to:
receive one or more datasets related to a plurality of intermediary agents from one or more databases, wherein the datasets comprise a plurality of historical primary attributes associated with each of the plurality of intermediary agents, wherein the plurality of intermediary agents comprises a set of fraudulent intermediary agents and a set of non-fraudulent intermediary agents;
analyse the plurality of historical primary attributes to determine a plurality of historical secondary attributes associated with each of the set of fraudulent intermediary agents and the set of non-fraudulent intermediary agents;
determine a threshold score indicative of a fraudulent behaviour of the intermediary agents based on the plurality of historical secondary attributes associated with each of the set of fraudulent intermediary agents and the set of non-fraudulent intermediary agents; and
train the model, based on the determined threshold score and the plurality of historical secondary attributes associated with each of the set of fraudulent intermediary agents and the set of non-fraudulent intermediary agents, to identify one or more fraudulent intermediary agents in real-time.
6. The system as claimed in claim 5, wherein
the score is a value between 0 and 1; and
the score more than the threshold score indicates the fraudulent behaviour.
7. The system as claimed in claim 1, wherein the processing unit is configured to:
determine a pattern between the one or more intermediary agents vis-a-vis one or more end users, based on past detection or past suspicious behaviour.
8. The system as claimed in claim 5, wherein the processing unit is configured to train the model for detecting false positives and false negatives.
9. The system as claimed in claim 1, wherein the processing unit is configured to store the secondary attributes in the database.
10. A method for identifying one or more fraudulent intermediary agents among a plurality of intermediary agents of an institution, the plurality of intermediary agents provide credit services to a plurality of end users on behalf of the institution, the method comprising:
receiving a plurality of primary attributes related to each of the intermediary agents;
storing the plurality of primary attributes in a database;
determining a plurality of secondary attributes indicative of conduct of each of the intermediary agents based on exploratory data analysis of the plurality of primary attributes;
generating a score indicative of fraudulent or non-fraudulent behaviour of each of the intermediary agents by processing the plurality of secondary attributes; and
identifying the one or more fraudulent intermediary agents among the plurality of intermediary agents based on the generated score for each of the plurality of intermediary agents.
11. The method as claimed in claim 10, wherein the plurality of primary attributes for each
of the intermediary agents comprises one or more of:
demographics details,
outstanding credit value,
expected credit collection value during a first time period,
actual credit collection value,
a number of new credit accounts,
a total credit value of the new credit accounts, and
a value of available credit reserve.
12. The method as claimed in claims 10-11, wherein the exploratory data analysis of the plurality of primary attributes to determine the plurality of secondary attributes comprises analyzing one or more of credit collection patterns of the plurality of agents, current and historical credit collection per agent, and actual credit collection in comparison to expected credit collection per agent.
13. The method as claimed in claim 10, wherein the plurality of secondary attributes comprises:
outstanding credit value,
regular credit recovery deviation from institution mean,
zero credit recovery deviation from institution mean,
regular credit recovery percentage over a second time period,
zero credit recovery percentage over the second time period,
float over the second time period,
total credit value of the new credit accounts in a region of the institution,
mean of percentage of credit collected,
institution wise mean of percentage of credit collected over a first time period,
institution wise zero credit recovery percentage mean,
institution wise sum of the new credit accounts,
a ratio of the total credit value of the new credit accounts and a number of the new
credit accounts.
14. The method as claimed in claim 10, wherein the pre-trained model is trained based on
historical primary and secondary attributes, and
wherein training of the pre-trained model comprises:
receiving one or more datasets related to a plurality of intermediary agents from one or more databases, wherein the datasets comprise a plurality of historical primary attributes associated with each of the plurality of intermediary agents, wherein the plurality of intermediary agents comprises a set of fraudulent intermediary agents and a set of non-fraudulent intermediary agents;
analysing the plurality of historical primary attributes to determine a plurality of historical secondary attributes associated with each of the set of fraudulent intermediary agents and the set of non-fraudulent intermediary agents;
determining a threshold score indicative of a fraudulent behaviour of the intermediary agents based on the plurality of historical secondary attributes associated with each of the set of fraudulent intermediary agents and the set of non-fraudulent intermediary agents; and
training the model, based on the determined threshold score and the plurality of historical secondary attributes associated with each of the set of fraudulent intermediary agents and the set of non-fraudulent intermediary agents, to identify one or more fraudulent intermediary agents in real-time.
15. The method as claimed in claim 14, wherein
the score is a value between 0 and 1; and
the score more than the threshold score indicates the fraudulent behaviour.
16. The method as claimed in claim 10, further comprising:
determining a pattern between the one or more intermediary agents vis-a-vis one or more end users, based on past detection or past suspicious behaviour.
17. The method as claimed in claim 14, further comprising training the model for detecting false positives and false negatives.
18. The method as claimed in claim 10, further comprising storing the secondary attributes in the database.
| # | Name | Date |
|---|---|---|
| 1 | 202321012364-STATEMENT OF UNDERTAKING (FORM 3) [23-02-2023(online)].pdf | 2023-02-23 |
| 2 | 202321012364-PROOF OF RIGHT [23-02-2023(online)].pdf | 2023-02-23 |
| 3 | 202321012364-FORM 1 [23-02-2023(online)].pdf | 2023-02-23 |
| 4 | 202321012364-DRAWINGS [23-02-2023(online)].pdf | 2023-02-23 |
| 5 | 202321012364-DECLARATION OF INVENTORSHIP (FORM 5) [23-02-2023(online)].pdf | 2023-02-23 |
| 6 | 202321012364-COMPLETE SPECIFICATION [23-02-2023(online)].pdf | 2023-02-23 |
| 7 | 202321012364-FORM-26 [07-07-2023(online)].pdf | 2023-07-07 |
| 8 | Abstract1.jpg | 2023-08-07 |