
Method And System For Predicting Credit Risk Of A Borrower

Abstract: Disclosed herein is a system (102) and method to predict a potential credit risk (122) of the borrower throughout the credit repayment process based not only on the past financial activity, but also on various external factors that have direct impact on individuals such as government policies, local and global economic factors, stock indices, inflation, natural calamities etc. The present disclosure further employs various Artificial Intelligence (AI) based models to provide highly accurate and time-efficient predictions by removing any errors that may arise due to human intervention during the task of analysing such high-volumes of data obtained from various sources. [Figure 1]


Patent Information

Application #
Filing Date
26 March 2021
Publication Number
39/2022
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Email
ipo@knspartners.com
Parent Application

Applicants

Zensar Technologies Limited
Plot #4, Zensar Knowledge Park, MIDC, Kharadi, Off Nagar Road, Pune, Maharashtra – 411014, India

Inventors

1. Katariyar, Saurabh
Zensar Technologies Ltd., Zensar Knowledge Park, Plot #4, MIDC, Kharadi, Off Nagar Road, Pune, Maharashtra – 411014, India
2. Kulkarni, Sumant
Zensar Technologies Ltd., Zensar Knowledge Park, Plot #4, MIDC, Kharadi, Off Nagar Road, Pune, Maharashtra – 411014, India
3. Dheenadayalan Kumar
Zensar Technologies Ltd., Zensar Knowledge Park, Plot #4, MIDC, Kharadi, Off Nagar Road, Pune, Maharashtra – 411014, India

Specification

FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENTS RULES, 2003
COMPLETE SPECIFICATION
(See section 10, rule 13)
1. Title of the Invention:
“A METHOD AND SYSTEM FOR PREDICTING CREDIT RISK
OF A BORROWER”
2. APPLICANT (S) -
(a) Name : Zensar Technologies Limited
(b) Nationality : Indian
(c)Address : Plot#4 Zensar Knowledge Park, MIDC, Kharadi,
Off Nagar Road, Pune, Maharashtra - 411014, India.
The following specification particularly describes the invention and the manner in which it is to be performed.

TECHNICAL FIELD
[001] The present invention relates to the field of data analysis, and more particularly to analyzing data obtained from various sources in order to dynamically predict credit risk of a borrower.
BACKGROUND OF INVENTION
[002] Lending companies such as banks and other financial institutions provide credit to borrowers for their financial needs. The borrowers can either be individuals or companies or business organizations belonging to different industrial sectors and/or having different job titles. The lending companies before granting a credit to the borrowers determine financial health of the borrower based on various factors such as occupation, annual income, credit rating, personal details and financial assets such as property ownerships, savings, insurance policies etc. However, it may so happen that after the credit has been granted to the borrower, his/her financial health declines due to various socio-economic factors such as inflation, recession, health issues, pandemic etc and he/she may not be able to pay equated monthly instalments (EMI) to the lending company for the granted credit on time and thereby become a defaulter. Defaulting credit has an adverse effect on the financial health of the lending company as well and therefore, it would be beneficial for the lending companies to dynamically assess the credit risk of the active credit accounts throughout the credit repayment process in order to determine whether a borrower will be in a position to pay his/her EMIs on time or not.
[003] Based on above, there is therefore a need for a system that monitors, assesses and predicts a potential credit risk of the borrower throughout the credit repayment process based not only on the past financial activity, but also on various external factors that have direct impact on individuals such as government policies, local and global economic factors, stock indices, inflation, natural calamities etc.
[004] The information disclosed in this background of the disclosure section is only for enhancement of understanding of the general background of the invention and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art already known to a person skilled in the art.

SUMMARY OF INVENTION
[005] The present disclosure overcomes one or more shortcomings of the prior art and provides additional advantages discussed throughout the present disclosure. Additional features and advantages are realized through the techniques of the present disclosure. Other embodiments and aspects of the disclosure are described in detail herein and are considered a part of the claimed disclosure.
[006] In one embodiment of the present disclosure, a method for dynamically predicting a credit risk of a borrower is disclosed. The method comprises fetching borrower data, macroeconomic data, and media data from one or more data sources, where the borrower data comprises borrower’s sector. The method further comprises computing a plurality of relevant attributes from the borrower data, where each of the plurality of relevant attributes impacts the credit risk of the borrower. The method further comprises extracting a plurality of sectors, from the macroeconomic data and the media data, by employing a Natural Language Processing (NLP) technique. The method further comprises generating a plurality of influence scores corresponding to the plurality of sectors, where each influence score indicates a level of impact on each sector derived from currently available information about each sector as per the macroeconomic data and the media data. The method further comprises selecting a set of sectors, amongst the plurality of sectors, based on the plurality of influence scores, where the set of sectors is selected such that each sector has an influence score above a threshold influential score. The method further comprises selecting one or more pertinent sectors, amongst the set of sectors, related to the borrower’s sector. The method further comprises applying the plurality of relevant attributes and the one or more pertinent sectors to a pre-trained model in order to predict the credit risk of the borrower, where the credit risk of the borrower is predicted by classifying the borrower in a risk class amongst a plurality of risk classes.
[007] In one embodiment of the present disclosure, a system for dynamically predicting a credit risk of a borrower is disclosed. The system comprises a fetching unit configured to fetch borrower data, macroeconomic data, and media data from one or more data sources, where the borrower data comprises borrower’s sector. The system further comprises a computation unit configured to compute a plurality of relevant attributes from the borrower data, where each of the plurality of relevant attributes impacts the credit risk of the borrower. The system further comprises an extraction unit configured to extract a plurality of sectors, from the macroeconomic data and the media data, by employing a Natural Language Processing (NLP) technique. The system further comprises a generation unit configured to generate a plurality of influence scores corresponding to the plurality of sectors, where each influence score indicates a level of impact on each sector derived from currently available information about each sector as per the macroeconomic data and the media data. The system further comprises a selection unit configured to select a set of sectors, amongst the plurality of sectors, based on the plurality of influence scores, where the set of sectors is selected such that each sector has an influence score above a threshold influential score and select one or more pertinent sectors, amongst the set of sectors, related to the borrower’s sector. The system further comprises a prediction unit configured to apply the plurality of relevant attributes and the one or more pertinent sectors to a pre-trained model in order to predict the credit risk of the borrower, wherein the credit risk of the borrower is predicted by classifying the borrower in a risk class amongst a plurality of risk classes.
[008] The foregoing summary is illustrative only and is not intended to be in any way limiting. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features will become apparent by reference to the drawings and the following detailed description.
BRIEF DESCRIPTION OF DRAWINGS
[009] The embodiments of the disclosure itself, as well as a preferred mode of use, further objectives and advantages thereof, will best be understood by reference to the following detailed description of an illustrative embodiment when read in conjunction with the accompanying drawings. One or more embodiments are now described, by way of example only, with reference to the accompanying drawings in which:
[0010] Figure 1 shows an exemplary environment 100 of a system for dynamically predicting a credit risk of a borrower, in accordance with an embodiment of the present disclosure;

[0011] Figure 2 shows a block diagram 200 illustrating a system for dynamically predicting a credit risk of a borrower, in accordance with an embodiment of the present disclosure;
[0012] Figure 3A shows a method 300A for dynamically predicting a credit risk of a borrower, in accordance with an embodiment of the present disclosure;
[0013] Figure 3B shows a method 300B for computing a plurality of relevant attributes, in accordance with an embodiment of the present disclosure; and
[0014] Figure 4 shows a block diagram of an exemplary computer system 400 for implementing the embodiments consistent with the present disclosure.
[0015] The figures depict embodiments of the disclosure for purposes of illustration only. One skilled in the art will readily recognize from the following description that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the disclosure described herein.
DETAILED DESCRIPTION OF THE INVENTION
[0016] The foregoing has broadly outlined the features and technical advantages of the present disclosure in order that the detailed description of the disclosure that follows may be better understood. It should be appreciated by those skilled in the art that the conception and specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures for carrying out the same purposes of the present disclosure.
[0017] The novel features which are believed to be characteristic of the disclosure, both as to its organization and method of operation, together with further objects and advantages will be better understood from the following description when considered in connection with the accompanying figures. It is to be expressly understood, however, that each of the figures is provided for the purpose of illustration and description only and is not intended as a definition of the limits of the present disclosure.

[0018] Disclosed herein is a system and method for dynamically predicting a credit risk of a borrower. Lending companies such as banks and other financial institutions, while providing credit to a borrower, assess only the financial health of the borrower at that time, based on factors such as occupation, annual income, credit rating, personal details and financial assets such as property ownerships, savings, insurance policies etc. These factors act as a basis for the lending company to decide whether or not the borrower is a suitable candidate for the grant of credit. However, even after assessing the suitability of the borrower for the grant of credit, the lending company cannot be entirely sure whether the borrower would repay the credit on time, as there is no mechanism that allows the lending companies to monitor and assess the financial health of the borrower throughout the credit repayment process. For instance, consider the Covid-19 pandemic that started in India in the year 2020, resulting in a nationwide lockdown which affected the financial health of many individuals and companies. Now, take the example of an individual who took a credit from a lending company such as a bank for setting up a gymnasium a few months before the lockdown. The lending company would have granted the credit to the individual based on his/her financial health at the time of requesting the credit. But due to the pandemic, gymnasiums across the country were shut. The shutting down of gymnasiums would directly affect the financial health of the individual who took credit from the lending company and he/she might not be in a position to repay the monthly instalments of the issued credit in a timely fashion. This would in turn affect the financial health of the lending company. Therefore, it is not just advisable, but beneficial for the lending companies to monitor, assess and dynamically predict the credit risk of the borrower throughout the credit repayment process.
[0019] The present disclosure understands this need and provides a system and method to determine a potential credit risk of the borrower throughout the credit repayment process based not only on the past financial activity, but also on various external factors that have direct impact on individuals such as government policies, local and global economic factors, stock indices, inflation, natural calamities etc. The present disclosure further employs various Artificial Intelligence (AI) based models to provide highly accurate and time-efficient predictions by removing any errors that may arise due to human intervention during the task of analysing such high-volumes of data obtained from various sources. The detailed working of the system has been explained in the upcoming paragraphs.
[0020] Figure 1 shows an exemplary environment 100 of a system for dynamically predicting a credit risk of a borrower, in accordance with an embodiment of the present disclosure. It must be understood by a person skilled in the art that the system may also be implemented in various environments, other than as shown in Fig. 1.
[0021] The detailed explanation of the exemplary environment 100 is explained in conjunction with Figure 2 that shows a block diagram 200 of a system 102 for dynamically predicting a credit risk 122 of a borrower, in accordance with an embodiment of the present disclosure. Although the present disclosure is explained considering that the system 102 is implemented on a server, it may be understood that the system 102 may be implemented in a variety of computing systems, such as a laptop computer, a desktop computer, a notebook, a workstation, a mainframe computer, a server, a network server, a cloud-based computing environment.
[0022] In one implementation, the system 102 may comprise an I/O interface 202, a processor 204, a memory 206 and the units 208. The memory 206 may be communicatively coupled to the processor 204 and the units 208. Further, the memory 206 may store multi-modal data 206A and a pre-trained model 120. The significance and use of each of the stored quantities is explained in the upcoming paragraphs of the specification. The processor 204 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the processor 204 is configured to fetch and execute computer-readable instructions stored in the memory 206. The I/O interface 202 may include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like. The I/O interface 202 may enable the system 102 to communicate with other computing devices, such as web servers and external data servers (not shown). The I/O interface 202 may facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. The I/O interface 202 may include one or more ports for connecting many devices to one another or to another server.
[0023] In one implementation, the units 208 may comprise a fetching unit 210, a computation unit 212, an extraction unit 214, a generation unit 216, a selection unit 218, a prediction unit 220 and an explainability unit 222. The computation unit 212 further comprises a refinement unit 212A, an attribute extraction unit 212B, an imputation unit 212C, an encoding unit 212D, and an attribute processing unit 212E. According to embodiments of present disclosure, these units 210-222 may comprise hardware components like processor, microprocessor, microcontrollers, application-specific integrated circuit for performing various operations of the system 102. It must be understood to a person skilled in art that the processor 204 may perform all the functions of the units 210-222 according to various embodiments of the present disclosure.
[0024] Now referring to figure 1, the environment 100 shows a system 102 that receives multi-modal data 104 - 114 from various sources and analyses the received multi-modal data to dynamically predict a credit risk 122 of a borrower. The credit risk 122 of the borrower is predicted by classifying the borrower in a risk class amongst a plurality of risk classes. Further, since the credit risk 122 is dynamically predicted, the risk class in which the borrower is classified may change through the entire credit repayment process which is explained in detail in the upcoming paragraphs. Now, the working of the system 102 is explained by taking an example of an individual (say John) who is a restaurant owner and wants to take a credit for expanding his restaurant business. John approaches a lending company (for instance, XYZ Bank) in June 2019, with the request for credit where the financial health of John is assessed by the XYZ Bank by fetching borrower data 104 which comprises the industrial sector to which John belongs along with his credit data 106, bureau data 108 and personal data 110 as tabulated in the table 1 below:
Table 1: Borrower Data 104

Borrower Data (104)
Borrower’s Sector Credit Data (106) Bureau Data Personal Data
EMI history Credit grade Address

Industrial Sector or
Professional Job
Title Account Activity Sub-credit grade Occupation

Debt-to-Income Ratio Employment Length

Late Payments Annual Income

Credit rating/score Marital Status

Credit lines Financial Assets

Amount requested

Risk Score
It may be noted by a skilled person that the various data fields of the borrower data 104 as described in table 1 are exemplary and should not be construed to be limiting. Accordingly, the borrower data 104 may also include other data fields.
The borrower data 104 pertaining to John is stored in a database and is periodically updated. Now, based on the borrower data 104, John is granted a credit in August 2019 which must be repaid for instance, within a period of 3 years starting from the date of grant. Therefore, starting from August 2019, John has an active credit account with the XYZ Bank. The system 102 now predicts a credit risk 122 pertaining to John’s active credit account in say, December 2019.
For this purpose, the fetching unit 210 fetches the borrower data 104 from the database, and the macroeconomic data 112 and media data 114 from one or more data sources. In one embodiment, the macroeconomic data 112 comprises at least one of government policies, stock indices, financial reports and inflation reports and may be obtained from one or more financial sources such as Google™ Finance. In one embodiment, the media data 114 comprises a plurality of news articles and social media feeds and may be obtained from one or more online news repositories such as Google™ News and various social media websites.
Once the multi-modal data 104-114 has been fetched by the fetching unit 210, the system 102 performs distinct operations in two parallel pipelines. The first pipeline processes the borrower data 104 for attribute computation 116. For this purpose, the borrower data 104 fetched from the database is fed to the computation unit 212 to compute a plurality of relevant attributes that would impact the credit risk 122 of John. As described above, the computation unit 212 further comprises the refinement unit 212A, the attribute extraction unit 212B, the imputation unit 212C, the encoding unit 212D, and the attribute processing unit 212E, each of which performs a dedicated task in order to compute the plurality of relevant attributes. The borrower data 104 is first provided to the refinement unit 212A that checks the data type of the borrower data 104 and converts the borrower data 104 into a required data type. Once the borrower data 104 has been converted to a required data type, the attribute extraction unit 212B extracts a plurality of attributes from the refined borrower data. In one embodiment, the plurality of attributes comprises at least one of EMI payment history, borrower’s sector, annual income, credit grade and credit rating etc. Now, once the attributes are extracted, it is checked whether any entries related to any of the plurality of attributes are missing. In accordance with the exemplary embodiment, it is observed that John’s credit score has a missing entry. To fill the missing entry, the imputation unit 212C performs a correlation analysis and statistical imputation. For instance, the imputation unit 212C performs a correlation between John’s account activity and his previous EMI payments to impute his credit score. Once the missing entries have been imputed, the plurality of attributes is provided to the encoding unit 212D, where each attribute is encoded in a format depending upon the type of attribute. In one embodiment, if the attribute is categorical and ordinal, such as the credit-grade, the encoding unit 212D encodes the attribute through label encoding.
In another embodiment, if the attribute is categorical and nominal, such as residential state of the borrower, the encoding unit 212D encodes the attribute through one-hot encoding. In yet another embodiment, if the attribute has a date-time format, such as duration for which credit is taken, the encoding unit 212D encodes the attribute through cyclic encoding. The plurality of encoded attributes is then provided to the attribute processing unit 212E. The attribute processing unit 212E determines a permutation-based importance of each of the encoded attributes in order to compute the plurality of relevant attributes that are highly important in predicting the credit risk of the borrower.
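The three encodings named above can be illustrated with a minimal sketch. The specification does not give an implementation, so the function names, the ordering of credit grades, the list of states and the choice of a 12-month period are all assumptions made for illustration only.

```python
import math

# Hypothetical ordering of credit grades (assumption, not from the specification).
CREDIT_GRADES = ["A", "B", "C", "D", "E", "F", "G"]


def label_encode(value, ordered_categories):
    """Ordinal attribute -> integer rank that preserves the category order."""
    return ordered_categories.index(value)


def one_hot_encode(value, categories):
    """Nominal attribute -> binary vector with a single 1 at the category's position."""
    return [1 if value == c else 0 for c in categories]


def cyclic_encode(value, period):
    """Periodic attribute (e.g. month of year) -> (sin, cos) point on the unit circle."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


# Illustrative values only.
grade_code = label_encode("C", CREDIT_GRADES)
state_vec = one_hot_encode("Maharashtra", ["Gujarat", "Karnataka", "Maharashtra"])
month_sin, month_cos = cyclic_encode(12, 12)  # December on an assumed 12-month cycle
```

Label encoding keeps the order of grades in a single number, one-hot encoding avoids implying any order between states, and the sine/cosine pair places December next to January rather than eleven steps away from it.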
The second pipeline performs NLP sector extraction 118 on the macroeconomic data 112 and media data 114 to obtain one or more pertinent sectors related to the borrower, that are impacted as per the information currently available in the macroeconomic data 112 and media data 114. For this purpose, the extraction unit 214 extracts a plurality of sectors from the macroeconomic data 112 and the media data 114 by employing a Natural Language Processing (NLP) technique. In one embodiment, the plurality of sectors corresponds to industrial sectors. In accordance with the exemplary embodiment, the system 102 is predicting the credit risk of John’s active credit account in December 2019. Therefore, the extraction unit 214 focusses on the news articles, stock indices, financial reports, inflation reports and social media feeds for that particular time period (say August 2019 - December 2019) and extracts a plurality of sectors that are affected based on the information available in the news articles, stock indices, financial reports, inflation reports and social media feeds. In accordance with the exemplary embodiment, it is observed that due to certain government policies and because of inflation, the sectors affected are mining, farm equipments manufacture, restaurant and catering sector and multiplexes. These sectors therefore form the plurality of sectors extracted by the extraction unit 214. It may be noted by a person skilled in the art that the above mentioned sectors are merely an example and there may be other sectors also which may be considered in the present disclosure.
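The specification does not state which NLP technique the extraction unit 214 uses. As a rough sketch only, the following assumes a plain keyword lookup as a stand-in; the keyword lexicon, the function name extract_sectors and the sample article texts are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical keyword lexicon; the specification does not define one.
SECTOR_KEYWORDS = {
    "Mining": {"mining", "mine", "coal"},
    "Restaurant and Catering": {"restaurant", "restaurants", "catering"},
    "Multiplexes": {"multiplex", "multiplexes", "cinema"},
}


def extract_sectors(documents):
    """Count, per sector, how many documents mention at least one of its keywords."""
    mentions = Counter()
    for text in documents:
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        for sector, keywords in SECTOR_KEYWORDS.items():
            if tokens & keywords:
                mentions[sector] += 1
    return mentions


# Made-up article snippets for illustration only.
articles = [
    "New tax rules raise costs for restaurants and catering firms.",
    "Coal mining output falls as inflation climbs.",
    "Restaurant owners report lower footfall this quarter.",
]
sector_mentions = extract_sectors(articles)
```

A production system would more plausibly rely on a trained text classifier or entity recogniser; the lookup above only shows the shape of the step, namely documents in and a set of affected sectors out.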
Further, the generation unit 216 generates a plurality of influence scores corresponding to the plurality of sectors extracted by the extraction unit 214. Each influence score indicates a level of impact on each sector as derived from the available information about each sector as per the macroeconomic data 112 and the media data 114. Now, in accordance with the exemplary embodiment, the plurality of influence scores generated for the plurality of sectors extracted by the extraction unit 214 are listed in table 2 below:
Table 2: Plurality of sectors and the corresponding influence scores

Sector | Influence Score (Max 1.0)
Farm Equipments Manufacture | 0.6
Mining | 0.4
Restaurant and Catering | 0.5
Multiplexes | 0.7

In the exemplary embodiment, depicted in table 2, the plurality of influence scores is generated in the form of decimal values such that an influence score of 1.0 would indicate maximum impact. However, it must be understood that the plurality of influence scores may be generated in the other forms including but not limited to percentage, grades and integral numerical values.
Based on the generated plurality of influence scores, the selection unit 218 selects a set of sectors, each of which has an influence score equal to or above a threshold influence score. For instance, if the threshold influence score is 0.5, the set of sectors selected by the selection unit 218 includes farm equipments manufacture, restaurant and catering and multiplexes. Further, the selection unit 218 selects one or more pertinent sectors amongst the set of sectors that are related to the borrower’s sector. In accordance with the exemplary embodiment, since John is into restaurant business, the one or more pertinent sectors would include “Restaurant and Catering”.
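The two selection steps can be sketched as follows, using the influence scores of table 2 and the example threshold of 0.5. The function names and the optional related-sector map are assumptions; the specification does not say how relatedness between sectors is decided, so by default a sector is treated as related only to itself.

```python
# Influence scores as listed in table 2.
influence_scores = {
    "Farm Equipments Manufacture": 0.6,
    "Mining": 0.4,
    "Restaurant and Catering": 0.5,
    "Multiplexes": 0.7,
}


def select_sectors(scores, threshold):
    """Keep sectors whose influence score is equal to or above the threshold."""
    return {sector: score for sector, score in scores.items() if score >= threshold}


def select_pertinent(selected, borrower_sector, related=None):
    """Keep the selected sectors that are related to the borrower's sector.

    `related` is a hypothetical map from a borrower sector to related sectors.
    """
    candidates = {borrower_sector} | set((related or {}).get(borrower_sector, ()))
    return {sector: score for sector, score in selected.items() if sector in candidates}


selected = select_sectors(influence_scores, threshold=0.5)
pertinent = select_pertinent(selected, "Restaurant and Catering")
```

With these inputs, `selected` holds the three sectors at or above 0.5 and `pertinent` holds only "Restaurant and Catering", matching the example of John.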
Next, the computed plurality of relevant attributes from the first pipeline and the selected one or more pertinent sectors from the second pipeline are provided to the prediction unit 220 that applies the pre-trained model 120 to predict the credit risk 122 of the borrower. The credit risk 122 of the borrower is predicted by classifying the borrower in a risk class amongst a plurality of risk classes. In one embodiment, the risk classes comprise at least one of a fully-paid class, a charged off class, a late class, a normal class and an in-grace period class. In another embodiment, the prediction unit 220, besides classifying the borrower in a risk class, may also output a probability score. For instance, in accordance with the exemplary embodiment, based on the influence score of the “restaurant and catering” sector and the plurality of relevant attributes, the prediction unit 220 assigns John to the normal class and provides a probability score of 0.7. This may help XYZ Bank to understand that there is a high probability that John would make good on his monthly EMI payments.
However, in another embodiment, if the influence score of the sector pertinent to the borrower’s sector is below a threshold, the prediction unit 220 may understand that the borrower’s sector is not readily affected by fluctuations in market and economy and may therefore, predict the credit risk 122 based on the plurality of relevant attributes computed from the borrower data 104.
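One possible reading of the prediction step is sketched below. It assumes that the pre-trained model 120 returns one probability per risk class; the feature names, the attribute values and the class probabilities are hypothetical placeholders and not outputs of any real model.

```python
def build_model_input(relevant_attributes, pertinent_scores, threshold=0.5):
    """Assemble the features passed to the classifier.

    A sector's influence score is included only when it reaches the threshold;
    otherwise the prediction rests on the borrower attributes alone.
    """
    features = dict(relevant_attributes)
    for sector, score in pertinent_scores.items():
        if score >= threshold:
            features["influence:" + sector] = score
    return features


def classify(class_probabilities):
    """Return the most probable risk class together with its probability score."""
    risk_class = max(class_probabilities, key=class_probabilities.get)
    return risk_class, class_probabilities[risk_class]


# Hypothetical encoded attributes and model output, for illustration only.
attributes = {"credit_grade": 2, "annual_income": 0.4}
with_sector = build_model_input(attributes, {"Restaurant and Catering": 0.9})
without_sector = build_model_input(attributes, {"Restaurant and Catering": 0.3})
risk_class, probability = classify({"normal": 0.7, "late": 0.2, "in-grace period": 0.1})
```

Dropping a below-threshold sector from the input is only one way to realise the fallback described above; another design could keep the feature and let the model learn to discount it.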
Further, since the objective of the present disclosure is to provide dynamic credit risk prediction, the system 102 performs the operations of the first pipeline and the second pipeline periodically in order to update the credit risk 122 of the borrower. To elaborate this further in accordance with the exemplary embodiment, the system 102 predicts a credit risk 122 pertaining to John’s active credit account in say, June 2020 in order to update the credit risk 122 predicted during December 2019.
Post-fetching the multi-modal data 104-114 by the fetching unit 210, the system 102 performs similar first pipeline operations in order to compute the plurality of relevant attributes in a manner as described in preceding paragraphs.
The second pipeline processes the macroeconomic data 112 and media data 114 to obtain one or more pertinent sectors related to the borrower, that are impacted as per the information available in the macroeconomic data 112 and media data 114. For this purpose, the extraction unit 214 extracts a plurality of sectors from the macroeconomic data 112 and the media data 114 by employing a Natural Language Processing (NLP) technique. In accordance with the exemplary embodiment, the extraction unit 214 focusses on the news articles, stock indices, financial reports, inflation reports and social media feeds for a time period, say, January 2020 – June 2020 and extracts a plurality of sectors that are affected based on the information available in the news articles, stock indices, financial reports, inflation reports and social media feeds. In accordance with the exemplary embodiment, it is observed that due to the Covid-19 pandemic across the nation, a plurality of sectors is affected. This plurality of sectors is extracted by the extraction unit 214.
Now, based on the influence scores generated by the generation unit 216 for each of the plurality of sectors in a similar fashion as described above, and based on the borrower’s sector, one or more pertinent sectors are selected by the selection unit 218.

In accordance with the exemplary embodiment, the “restaurant and catering” sector is assigned a high influence score of 0.9 indicating that the restaurant and catering sector has been highly impacted because of the Covid-19 pandemic.
In accordance with the exemplary embodiment, the selected sector and the plurality of relevant attributes are provided to the pre-trained model 120, and John’s risk class may be updated to “Late class” with a probability score of 0.6 indicating that there is high probability that John may be late with his future monthly EMI payments.
Therefore, the credit risk prediction is dynamically updated based on the current financial situation of the borrower and the current economic and market situation.
Furthermore, to give analysts and regulatory bodies confidence in the credit risk prediction provided by the system 102, the system 102 comprises an explainability unit 222 that outputs the influence of each of the plurality of relevant attributes on the decision-making process through the I/O Interface 202. In other words, the explainability unit 222 allows an analyst to understand why a borrower has been placed in a certain risk class.
Figure 3A depicts a method 300A for dynamically predicting credit risk of a borrower, in accordance with an embodiment of the present disclosure.
As illustrated in figure 3A, the method 300A includes one or more blocks illustrating a method for dynamically predicting credit risk of a borrower. The method 300A may be described in the general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, and functions, which perform specific functions or implement specific abstract data types.
The order in which the method 300A is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method. Additionally, individual blocks may be deleted from the methods without departing from the spirit and scope of the subject matter described.

At block 302, the method 300A may include fetching borrower data 104, macroeconomic data 112, and media data 114 from one or more data sources.
At block 304, the method 300A may include computing a plurality of relevant attributes from the borrower data 104. The plurality of relevant attributes impacts the credit risk 122 of the borrower. The method for computing the plurality of relevant attributes is described with reference to figure 3B in the following paragraphs.
At block 306, the method 300A may include extracting a plurality of sectors, from the macroeconomic data 112 and the media data 114, by employing a Natural Language Processing (NLP) technique.
At block 308, the method 300A may include generating a plurality of influence scores corresponding to the plurality of sectors. Each influence score indicates a level of impact on each sector derived from currently available information about each sector as per the macroeconomic data 112 and the media data 114.
At block 310, the method 300A may include selecting a set of sectors, amongst the plurality of sectors, based on the plurality of influence scores.
At block 312, the method 300A may include selecting one or more pertinent sectors, amongst the set of sectors, related to the borrower’s sector.
At block 314, the method 300A may include applying the plurality of relevant attributes and the one or more pertinent sectors to a pre-trained model 120 in order to dynamically predict the credit risk 122 of the borrower. The credit risk 122 of the borrower is predicted by classifying the borrower in a risk class amongst a plurality of risk classes.
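By way of a non-limiting illustration, the sector selection of blocks 310 and 312 may be sketched as follows. The sketch assumes that the influence scores have already been generated at block 308. The threshold value of 0.5, the `RELATED` mapping between sectors, the function names, and all scores other than the 0.9 for “restaurant and catering” are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectorScore:
    sector: str
    influence: float  # level of impact on the sector, higher means more impacted


def select_influential(scores, threshold):
    """Block 310: keep sectors whose influence score is above the threshold."""
    return [s for s in scores if s.influence > threshold]


def select_pertinent(selected, borrower_sector, related):
    """Block 312: keep the selected sectors that relate to the borrower's sector.

    Assumption: relatedness is given by a lookup table; the disclosure does
    not specify how related sectors are determined.
    """
    allowed = {borrower_sector} | related.get(borrower_sector, set())
    return [s for s in selected if s.sector in allowed]


# Hypothetical scores; only the 0.9 for "restaurant and catering" is from the text.
scores = [
    SectorScore("restaurant and catering", 0.9),
    SectorScore("information technology", 0.2),
    SectorScore("tourism", 0.8),
]
RELATED = {"restaurant and catering": {"tourism"}}  # hypothetical mapping

influential = select_influential(scores, threshold=0.5)
pertinent = select_pertinent(influential, "restaurant and catering", RELATED)
print([s.sector for s in pertinent])
```

The one or more pertinent sectors obtained in this way would then be supplied, together with the plurality of relevant attributes, to the pre-trained model 120 at block 314.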
Figure 3B shows a method 300B for computing a plurality of relevant attributes, in accordance with an embodiment of the present disclosure.
As illustrated in figure 3B, the method 300B includes one or more blocks illustrating a method for computing a plurality of relevant attributes. The method 300B may be described in the general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, and functions, which perform specific functions or implement specific abstract data types.
[0055] The order in which the method 300B is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method. Additionally, individual blocks may be deleted from the methods without departing from the spirit and scope of the subject matter described.
[0056] At block 304-1, the method 300B may include refining the borrower data 104 in order to convert each of the borrower data 104 to a required data type.
[0057] At block 304-2, the method 300B may include extracting a plurality of attributes from the borrower data 104 refined.
[0058] At block 304-3, the method 300B may include performing a statistical imputation on the plurality of attributes.
[0059] At block 304-4, the method 300B may include encoding the plurality of attributes after being statistically imputed by performing at least one of a label encoding, a one-hot encoding and a cyclic encoding.
[0060] At block 304-5, the method 300B may include performing a permutation-based attribute importance to compute the plurality of relevant attributes.
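By way of a non-limiting illustration, the imputation of block 304-3 and the three encodings of block 304-4 may be sketched as follows. The sketch uses only the Python standard library and assumes median imputation as the statistical imputation; the disclosure does not name a specific imputation statistic. All function names and sample values are hypothetical. The permutation-based attribute importance of block 304-5 is not covered by this sketch.

```python
import math
from statistics import median


def impute_median(values):
    """Block 304-3 (assumed variant): replace missing values with the median."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def label_encode(values, order):
    """Label encoding for a categorical and ordinal attribute, e.g. credit grade."""
    rank = {category: i for i, category in enumerate(order)}
    return [rank[v] for v in values]


def one_hot_encode(values):
    """One-hot encoding for a categorical and nominal attribute, e.g. marital status."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


def cyclic_encode(value, period):
    """Cyclic encoding for a date-time component: map it onto the unit circle."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


# Hypothetical sample values.
incomes = impute_median([40000, None, 60000])
grades = label_encode(["B", "A", "C"], order=["A", "B", "C"])
columns, marital = one_hot_encode(["single", "married", "single"])
december = cyclic_encode(12, period=12)
january = cyclic_encode(1, period=12)
print(incomes, grades, columns, marital)
```

With cyclic encoding, December and January lie close together on the unit circle, whereas a plain integer encoding would place them eleven units apart.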
Computer System
[0061] Figure 4 illustrates a block diagram of an exemplary computer system 400 for implementing embodiments consistent with the present disclosure. A person skilled in the art will understand that the computer system 400 and its components are similar to the system 102 described with reference to Figure 2. In an embodiment, the computer system 400 may be a peripheral device, which is used for predicting a credit risk of a borrower. The computer system 400 may include a central processing unit (“CPU” or “processor”) 402. The processor 402 may comprise at least one data processor for executing program components for executing user or system-generated business processes. The processor 402 may include specialized processing units such as integrated system (bus) controllers, memory management control units, floating point units, graphics processing units, digital signal processing units, etc.
[0062] The processor 402 may be disposed in communication with one or more input/output (I/O) devices via I/O interface 401. The I/O interface 401 may employ communication protocols/methods such as, without limitation, audio, analog, digital, stereo, IEEE-1394, serial bus, Universal Serial Bus (USB), infrared, PS/2, BNC, coaxial, component, composite, Digital Visual Interface (DVI), high-definition multimedia interface (HDMI), Radio Frequency (RF) antennas, S-Video, Video Graphics Array (VGA), IEEE 802.11a/b/g/n/x, Bluetooth, cellular (e.g., Code-Division Multiple Access (CDMA), High-Speed Packet Access (HSPA+), Global System For Mobile Communications (GSM), Long-Term Evolution (LTE) or the like), etc. Using the I/O interface 401, the computer system 400 may communicate with one or more I/O devices.
[0063] In some embodiments, the processor 402 may be disposed in communication with a communication network 414 via a network interface 403. The network interface 403 may communicate with the communication network 414. The network interface 403 may employ connection protocols including, without limitation, direct connect, Ethernet (e.g., twisted pair 10/100/1000 Base T), Transmission Control Protocol/Internet Protocol (TCP/IP), token ring, IEEE 802.11a/b/g/n/x, etc.
[0064] The communication network 414 can be implemented as one of several types of networks, such as an intranet or a Local Area Network (LAN) within the organization. The communication network 414 may either be a dedicated network or a shared network, which represents an association of several types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), etc., to communicate with each other. Further, the communication network 414 may include a variety of network devices, including routers, bridges, servers, computing devices, storage devices, etc.
[0065] In some embodiments, the processor 402 may be disposed in communication with a memory 405 (e.g., RAM 412, ROM 413, etc. as shown in FIG. 4) via a storage interface 404. The storage interface 404 may connect to the memory 405 including, without limitation, memory drives, removable disc drives, etc., employing connection protocols such as Serial Advanced Technology Attachment (SATA), Integrated Drive Electronics (IDE), IEEE-1394, Universal Serial Bus (USB), fiber channel, Small Computer Systems Interface (SCSI), etc. The memory drives may further include a drum, magnetic disc drive, magneto-optical drive, optical drive, Redundant Array of Independent Discs (RAID), solid-state memory devices, solid-state drives, etc.
[0066] The memory 405 may store a collection of program or database components, including, without limitation, a user application, an operating system, a web browser, a mail client, a mail server, a web server and the like. In some embodiments, the computer system 400 may store user/application data, such as the data, variables, records, etc. as described in this invention. Such databases may be implemented as fault-tolerant, relational, scalable, secure databases such as Oracle® or Sybase®.
[0067] The operating system may facilitate resource management and operation of the computer system. Examples of operating systems include, without limitation, APPLE MACINTOSH® OS X, UNIX®, UNIX-like system distributions (e.g., BERKELEY SOFTWARE DISTRIBUTION™ (BSD), FREEBSD™, NETBSD™, OPENBSD™, etc.), LINUX DISTRIBUTIONS™ (e.g., RED HAT™, UBUNTU™, KUBUNTU™, etc.), IBM™ OS/2, MICROSOFT™ WINDOWS™ (XP™, VISTA™/7/8, 10 etc.), APPLE® IOS™, GOOGLE® ANDROID™, BLACKBERRY® OS, or the like. A user interface may facilitate display, execution, interaction, manipulation, or operation of program components through textual or graphical facilities. For example, user interfaces may provide computer interaction interface elements on a display system operatively connected to the computer system, such as cursors, icons, check boxes, menus, windows, widgets, etc. Graphical User Interfaces (GUIs) may be employed, including, without limitation, APPLE MACINTOSH® operating systems, IBM™ OS/2, MICROSOFT™ WINDOWS™ (XP™, VISTA™/7/8, 10 etc.), Unix® X-Windows, web interface libraries (e.g., AJAX™, DHTML™, ADOBE® FLASH™, JAVASCRIPT™, JAVA™, etc.), or the like.

[0068] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present invention. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., to be non-transitory. Examples include Random Access Memory (RAM), Read-Only Memory (ROM), volatile memory, non-volatile memory, hard drives, Compact Disc (CD) ROMs, Digital Video Discs (DVDs), flash drives, disks, and any other known physical storage media.
[0069] A description of an embodiment with several components in communication with each other does not imply that all such components are required. On the contrary, a variety of optional components are described to illustrate the wide variety of possible embodiments of the invention.
[0070] When a single device or article is described herein, it will be clear that more than one device/article (whether or not they cooperate) may be used in place of a single device/article. Similarly, where more than one device or article is described herein (whether or not they cooperate), it will be clear that a single device/article may be used in place of the more than one device or article, or a different number of devices/articles may be used instead of the shown number of devices or programs. The functionality and/or the features of a device may be alternatively embodied by one or more other devices which are not explicitly described as having such functionality/features. Thus, other embodiments of the invention need not include the device itself.
[0071] Finally, the language used in the specification has been principally selected for readability and instructional purposes, and it may not have been selected to delineate or circumscribe the inventive subject matter. It is therefore intended that the scope of the invention be limited not by this detailed description, but rather by any claims that issue on an application based hereon. Accordingly, the embodiments of the present invention are intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the following claims.
[0072] While various aspects and embodiments have been disclosed herein, other aspects and embodiments will be apparent to those skilled in the art. The various aspects and embodiments disclosed herein are for purposes of illustration and are not intended to be limiting, with the true scope and spirit being indicated by the following claims.
Reference Numerals:

Reference Numeral Description
100 Exemplary environment of a system for dynamically predicting credit risk of a borrower
102 System
104 Borrower Data
106 Credit Data
108 Bureau Data
110 Personal Data
112 Macroeconomic Data
114 Media Data
116 Attribute Computation
118 NLP Sector Extraction
120 Pre-trained Model
122 Credit Risk
200 Block diagram of the system
202, 401 I/O Interface
204, 402 Processor
206, 405 Memory
206A Multi-modal data
208 Units
210 Fetching Unit
212 Computation Unit
212A Refinement Unit
212B Attribute Extraction Unit
212C Imputation Unit
212D Encoding Unit
212E Attribute Processing Unit
214 Extraction Unit
216 Generation Unit
218 Selection Unit
220 Prediction Unit
222 Explainability Unit
300A Method for dynamically predicting credit risk of a borrower
300B Method for computing a plurality of relevant attributes
400 Computing System
403 Network Interface
404 Storage Interface
406 User Application
407 Operating System
408 Web Browser
409 Mail Client
410 Mail Server
411 Web Server
412 RAM
413 ROM
414 Communication Network
415 Input Devices
416 Output Devices

We Claim:
1. A method for dynamically predicting a credit risk (122) of a borrower, the method comprising:
fetching (302) borrower data (104), macroeconomic data (112), and media data (114) from one or more data sources, wherein the borrower data (104) comprises borrower’s sector;
computing (304) a plurality of relevant attributes from the borrower data (104), wherein each of the plurality of relevant attributes impacts the credit risk (122) of the borrower;
extracting (306) a plurality of sectors, from the macroeconomic data and the media data, by employing a Natural Language Processing (NLP) technique;
generating (308) a plurality of influence scores corresponding to the plurality of sectors, wherein each influence score indicates a level of impact on each sector derived from currently available information about each sector as per the macroeconomic data (112) and the media data (114);
selecting (310) a set of sectors, amongst the plurality of sectors, based on the plurality of influence scores, wherein the set of sectors is selected such that each sector has an influence score above a threshold influential score;
selecting (312) one or more pertinent sectors, amongst the set of sectors, related to the borrower’s sector; and
applying (314) the plurality of relevant attributes and the one or more pertinent sectors to a pre-trained model (120) in order to dynamically predict the credit risk of the borrower, wherein the credit risk (122) of the borrower is predicted by classifying the borrower in a risk class amongst a plurality of risk classes.
2. The method as claimed in claim 1, wherein:
the borrower data (104) further comprises:
credit data (106) comprising at least one of a loan account activity of the borrower, a monthly equated monthly instalment (EMI) history and late payments;
bureau data (108) comprising at least one of a credit grade and a sub-credit grade associated with the borrower;
personal data (110) comprising at least one of a borrower address, marital status, occupation, annual income and credit rating;
the macro-economic data (112) comprises at least one of stock indices, financial reports and inflation reports; and
the media data (114) comprises a plurality of news articles and a plurality of social media feeds.
3. The method as claimed in claim 1, wherein computing (304) the plurality of relevant attributes comprises:
refining (304-1) the borrower data (104) in order to convert each of the borrower data (104) to a required data type;
extracting (304-2) a plurality of attributes from the borrower data (104) refined;
performing (304-3) a statistical imputation on the plurality of attributes;
encoding (304-4) the plurality of attributes after being statistically imputed by performing at least one of a label encoding, a one-hot encoding and a cyclic encoding, wherein:
label encoding is performed on a categorical and ordinal attribute;
one-hot encoding is performed on a categorical and nominal attribute; and
cyclic encoding is performed on an attribute having date-time format; and
performing (304-5) a permutation-based attribute importance to compute the plurality of relevant attributes.
4. The method as claimed in claim 1, wherein the pre-trained model (120) is at least one of a mid-fusion model and a late-fusion model.
5. The method as claimed in claim 1, wherein the plurality of risk classes comprises at least one of a fully-paid class, a charged off class, a late class, a normal class and an in-grace period class.
6. A system (102) for dynamically predicting a credit risk (122) of a borrower, the system (102) comprising:
a fetching unit (210) configured to fetch borrower data (104), macroeconomic data (112), and media data (114) from one or more data sources, wherein the borrower data (104) comprises borrower’s sector;
a computation unit (212) configured to compute a plurality of relevant attributes from the borrower data (104), wherein each of the plurality of relevant attributes impacts the credit risk (122) of the borrower;
an extraction unit (214) configured to extract a plurality of sectors, from the macroeconomic data (112) and the media data (114), by employing a Natural Language Processing (NLP) technique;
a generation unit (216) configured to generate a plurality of influence scores corresponding to the plurality of sectors, wherein each influence score indicates a level of impact on each sector derived from currently available information about each sector as per the macroeconomic data (112) and the media data (114);
a selection unit (218) configured to:
select a set of sectors, amongst the plurality of sectors, based on the plurality of influence scores, wherein the set of sectors is selected such that each sector has an influence score above a threshold influential score; and
select one or more pertinent sectors, amongst the set of sectors, related to the borrower’s sector; and
a prediction unit (220) configured to apply the plurality of relevant attributes and the one or more pertinent sectors to a pre-trained model (120) in order to dynamically predict the credit risk (122) of the borrower, wherein the credit risk (122) of the borrower is predicted by classifying the borrower in a risk class amongst a plurality of risk classes.
7. The system (102) as claimed in claim 6, wherein:
the borrower data (104) further comprises:
credit data (106) comprising at least one of a loan account activity of the borrower, a monthly equated monthly instalment (EMI) history and late payments;
bureau data (108) comprising at least one of a credit grade and a sub-credit grade associated with the borrower;
personal data (110) comprising at least one of a borrower address, marital status, occupation, annual income and credit rating;
the macro-economic data (112) comprises at least one of stock indices, financial reports and inflation reports; and
the media data (114) comprises a plurality of news articles and a plurality of social media feeds.
8. The system (102) as claimed in claim 6, wherein the computation unit (212) further comprises:
a refinement unit (212A) configured to refine the borrower data (104) in order to convert each of the borrower data (104) to a required data type;
an attribute extraction unit (212B) configured to extract a plurality of attributes from the borrower data (104) refined;
an imputation unit (212C) configured to perform a statistical imputation on the plurality of attributes;
an encoding unit (212D) configured to encode the plurality of attributes after being statistically imputed by performing at least one of a label encoding, a one-hot encoding and a cyclic encoding, wherein:
label encoding is performed on a categorical and ordinal attribute;
one-hot encoding is performed on a categorical and nominal attribute; and
cyclic encoding is performed on an attribute having date-time format; and
an attribute processing unit (212E) configured to perform a permutation-based attribute importance to compute the plurality of relevant attributes.
9. The system (102) as claimed in claim 6, wherein the pre-trained model (120) is at least one of a mid-fusion model and a late-fusion model.
10. The system (102) as claimed in claim 6, wherein the plurality of risk classes comprises at least one of a fully-paid class, a charged off class, a late class, a normal class and an in-grace period class.

Documents

Application Documents

# Name Date
1 202121013439-STATEMENT OF UNDERTAKING (FORM 3) [26-03-2021(online)].pdf 2021-03-26
2 202121013439-POWER OF AUTHORITY [26-03-2021(online)].pdf 2021-03-26
3 202121013439-FORM 18 [26-03-2021(online)].pdf 2021-03-26
4 202121013439-FORM 1 [26-03-2021(online)].pdf 2021-03-26
5 202121013439-FIGURE OF ABSTRACT [26-03-2021(online)].pdf 2021-03-26
6 202121013439-DRAWINGS [26-03-2021(online)].pdf 2021-03-26
7 202121013439-DECLARATION OF INVENTORSHIP (FORM 5) [26-03-2021(online)].pdf 2021-03-26
8 202121013439-COMPLETE SPECIFICATION [26-03-2021(online)].pdf 2021-03-26
9 202121013439-Proof of Right [15-07-2021(online)].pdf 2021-07-15
10 Abstract1.jpg 2021-10-19
11 202121013439-FER.pdf 2022-10-19
12 202121013439-OTHERS [13-04-2023(online)].pdf 2023-04-13
13 202121013439-FER_SER_REPLY [13-04-2023(online)].pdf 2023-04-13
14 202121013439-DRAWING [13-04-2023(online)].pdf 2023-04-13
15 202121013439-CLAIMS [13-04-2023(online)].pdf 2023-04-13

Search Strategy

1 search_strategy_1110E_11-10-2022.pdf