Abstract: A system to determine loan risk propensity of a subscriber of a telecom service is disclosed. The system generates a risk propensity model based upon telecom related parameters of a plurality of subscribers of the telecom service; receives telecom data pertaining to the subscriber; and applies the risk propensity model to the telecom data to generate loan risk propensity of the subscriber. The risk propensity model is generated based upon filtering the plurality of subscribers to generate a filtered dataset of the plurality of subscribers; aggregating the telecom related parameters of the filtered dataset to generate an aggregate data; reducing variables in the aggregate data; generating cleaned aggregate data based on the reduced variables; and using one or more analytic techniques on the cleaned aggregate data to generate the risk propensity model. A corresponding method is also described.
Claims:
1. A system to determine loan risk propensity of a subscriber of a telecom service, the system comprising:
one or more processors;
a modeler unit to control the one or more processors to generate a risk propensity model based upon telecom related parameters of a plurality of subscribers of the telecom service;
a receiver to control the one or more processors to receive telecom data pertaining to said subscriber; and
a loan risk propensity generator to control the one or more processors to apply said risk propensity model to said telecom data to generate loan risk propensity of said subscriber.
2. The system of claim 1, wherein said modeler unit filters said plurality of subscribers to generate a filtered dataset of said plurality of subscribers; aggregates said telecom related parameters of said filtered dataset to generate an aggregate data; reduces variables in said aggregate data; generates cleaned aggregate data based on said reduced variables; and uses one or more analytic techniques on said cleaned aggregate data to generate said risk propensity model.
3. The system of claim 2, wherein said filters pertain to any or a combination of tenure with said telecom service of each of said plurality of subscribers and category of telecom connection of each of said plurality of subscribers with said telecom service.
4. The system of claim 2, wherein said telecom related parameters comprise any or a combination of demographics of said filtered dataset, the telecom service usage data of said filtered dataset, and the telecom service bill payment data of said filtered dataset.
5. The system of claim 2, wherein said modeler unit applies any or a combination of correlation, multicollinearity and information value to said aggregate data to reduce said variables in said aggregate data.
6. The system of claim 2, wherein the one or more analytic techniques comprise any or a combination of regression, Random Forest, XGBoost, K-nearest neighbor, likelihood-ratio, and deep learning.
7. The system of claim 1, wherein the system further comprises a loan proposal unit to control the one or more processors to propose a loan to said subscriber based upon said loan risk propensity, purpose of said loan and amount of said loan.
8. A method to determine loan risk propensity of a subscriber of a telecom service, the method comprising:
generating, at a first computing device, a risk propensity model based upon telecom related parameters of a plurality of subscribers of said telecom service;
receiving, at a second computing device operatively connected to said first computing device, telecom data pertaining to said subscriber; and
generating, at a third computing device operatively connected to said first computing device and said second computing device, loan risk propensity of said subscriber by applying said risk propensity model to said telecom data,
wherein functionalities of any or a combination of said first computing device, said second computing device and said third computing device are located in a single computing device.
9. The method of claim 8, wherein generating the risk propensity model comprises:
filtering said plurality of subscribers to generate a filtered dataset of said plurality of subscribers;
aggregating said telecom related parameters of said filtered dataset to generate an aggregate data;
reducing variables in the aggregate data;
generating cleaned aggregate data based on said reduced variables; and
using one or more analytic techniques on said cleaned aggregate data to generate said risk propensity model.
10. The method of claim 8, wherein the method further comprises proposing a loan to said subscriber based upon said loan risk propensity, purpose of said loan and amount of said loan.
Description:
FIELD OF DISCLOSURE
[0001] The present disclosure relates to systems for determining loan risks. In particular, it pertains to a system that uses telecom related data to determine such risks.
BACKGROUND OF THE DISCLOSURE
[0002] The background description includes information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.
[0003] Creditworthiness (interchangeably referred to herein as loan risk propensity, credit rating or risk rating) of an entity is a well-known concept that applies to entities ranging from giant global organizations to single individuals. Depending upon many factors, a risk rating essentially indicates the entity’s ability and/or propensity to pay back its loans (existing or planned), and has a direct bearing on the entity’s ability to obtain such loans (for instance, from financial institutions or even shopkeepers) as well as on the cost of the loans taken.
[0004] Risk rating agencies (interchangeably termed as credit rating agencies or CRAs) exist at different scale of operations, segments and purposes. For instance, debt instruments rated by CRAs include government bonds, corporate bonds, CDs, municipal bonds, preferred stock, and collateralized securities, such as mortgage-backed securities and collateralized debt obligations. Globally, three credit rating agencies namely Moody's Investors Service, Standard & Poor's (S&P) and Fitch Ratings have a combined share of about 95 % of this segment.
[0005] At the retail/individual level there exist other agencies. For instance in India CIBIL/Experian offer credit reports of different entities, generally smaller enterprises or individuals. These credit reports are usually represented as an overall ‘credit score’. Such scores are increasingly being demanded by any loan granting agency that offers loans to this segment. For example, a bank usually asks for CIBIL score to determine eligibility of an individual for household loan.
[0006] However, scores as above are not available for everyone, since only people with a past credit history are scored by such bureaus. Since credit penetration in developing countries such as India is very low, a very large segment of the population simply does not have such a credit history and is hence denied loans needed for economic growth. As can be appreciated, this is a vicious circle that is not easy to break, since in the absence of a loan a repayment history will never be created, and in the absence of a repayment history it will be difficult to get a loan. Further, such scores, even if available, do not factor in parameters such as the amount of loan required, the purpose of the loan, the loan tenure, etc. to determine a person’s eligibility for the loan.
[0007] Hence, there is a need in the art for a system that need not necessarily rely only upon past credit history of a subscriber but instead can determine loan risk propensity of those who do not have such a history. Besides, the system should be able to adapt itself to varying scenarios such as amount, purpose and tenure of the loan sought to determine eligibility of a subscriber for the same.
[0008] All publications herein are incorporated by reference to the same extent as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference. Where a definition or use of a term in an incorporated reference is inconsistent or contrary to the definition of that term provided herein, the definition of that term provided herein applies and the definition of that term in the reference does not apply.
[0009] In some embodiments, the numbers expressing quantities or dimensions of items, and so forth, used to describe and claim certain embodiments of the invention are to be understood as being modified in some instances by the term “about.” Accordingly, in some embodiments, the numerical parameters set forth in the written description and attached claims are approximations that can vary depending upon the desired properties sought to be obtained by a particular embodiment. In some embodiments, the numerical parameters should be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of some embodiments of the invention are approximations, the numerical values set forth in the specific examples are reported as precisely as practicable. The numerical values presented in some embodiments of the invention may contain certain errors necessarily resulting from the standard deviation found in their respective testing measurements.
[00010] As used in the description herein and throughout the claims that follow, the meaning of “a,” “an,” and “the” includes plural reference unless the context clearly dictates otherwise. Also, as used in the description herein, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise.
[00011] The recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range. Unless otherwise indicated herein, each individual value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g. “such as”) provided with respect to certain embodiments herein is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention otherwise claimed. No language in the specification should be construed as indicating any non-claimed element essential to the practice of the invention.
[00012] Groupings of alternative elements or embodiments of the invention disclosed herein are not to be construed as limitations. Each group member can be referred to and claimed individually or in any combination with other members of the group or other elements found herein. One or more members of a group can be included in, or deleted from, a group for reasons of convenience and/or patentability. When any such inclusion or deletion occurs, the specification is herein deemed to contain the group as modified thus fulfilling the written description of all groups used in the appended claims.
OBJECTS OF THE INVENTION
[00013] Some of the objects of the present disclosure, which at least one embodiment herein satisfies, are listed below.
[00014] It is an object of the present disclosure to provide for a system that provides loan risk propensities of entities that have no or insufficient past credit histories.
[00015] It is another object of the present disclosure to provide for a system that adapts to the loan risk propensity of an entity and proposes a loan for the entity accordingly.
[00016] It is yet another object of the present disclosure to provide for a system that reduces cost of credit underwriting.
SUMMARY
[00017] The present disclosure mainly relates to a system that uses telecom related data to determine loan risks. In particular, it pertains to a system that uses such data to determine loan risk propensity of a subscriber.
[00018] In an aspect, a system to determine loan risk propensity of a subscriber of a telecom service as disclosed herein can include: one or more processors; a modeler unit to control the one or more processors to generate a risk propensity model based upon telecom related parameters of a plurality of subscribers of the telecom service; a receiver to control the one or more processors to receive telecom data pertaining to the subscriber; and a loan risk propensity generator to control the one or more processors to apply the risk propensity model to the telecom data to generate loan risk propensity of the subscriber.
[00019] In another aspect, the modeler unit can filter the plurality of subscribers to generate a filtered dataset of the plurality of subscribers; can aggregate the telecom related parameters of the filtered dataset to generate an aggregate data; can reduce variables in the aggregate data; can generate cleaned aggregate data based on the reduced variables; and can use one or more analytic techniques on the cleaned aggregate data to generate the risk propensity model.
[00020] In yet another aspect, the filters can pertain to any or a combination of tenure with the telecom service of each of the plurality of subscribers and category of telecom connection of each of the plurality of subscribers with the telecom service.
[00021] In an aspect, the telecom related parameters can include any or a combination of demographics of the filtered dataset, the telecom service usage data of the filtered dataset, and the telecom service bill payment data of the filtered dataset.
[00022] In another aspect, the modeler unit can apply any or a combination of correlation, multicollinearity and information value to the aggregate data to reduce the variables in the aggregate data.
[00023] In yet another aspect, the one or more analytic techniques can include any or a combination of regression, Random Forest, XGBoost, K-nearest neighbor, likelihood-ratio, and deep learning.
[00024] In an aspect, the proposed system can further include a loan proposal unit to control the one or more processors to propose a loan to the subscriber based upon the loan risk propensity, purpose of the loan and amount of the loan.
[00025] In an aspect, the present disclosure elaborates upon a method to determine loan risk propensity of a subscriber of a telecom service, the method including: generating, at a first computing device, a risk propensity model based upon telecom related parameters of a plurality of subscribers of the telecom service; receiving, at a second computing device operatively connected to the first computing device, telecom data pertaining to the subscriber; and generating, at a third computing device operatively connected to the first computing device and the second computing device, loan risk propensity of the subscriber by applying the risk propensity model to the telecom data, wherein functionalities of any or a combination of the first computing device, the second computing device and the third computing device can be located in a single computing device.
[00026] In another aspect, generating the risk propensity model can include: filtering the plurality of subscribers to generate a filtered dataset of the plurality of subscribers; aggregating the telecom related parameters of the filtered dataset to generate an aggregate data; reducing variables in the aggregate data; generating cleaned aggregate data based on the reduced variables; and using one or more analytic techniques on the cleaned aggregate data to generate the risk propensity model.
[00027] In yet another aspect, the method can include proposing a loan to the subscriber based upon the loan risk propensity, purpose of the loan and amount of the loan.
[00028] The technical problem addressed by the present invention is that there are presently no means to determine loan risk propensities of entities with no or limited past credit history.
[00029] To solve the above recited and other problems in the prior art, the present invention provides the following technical solution: it uses telecom data of a subscriber to determine the subscriber’s loan risk propensity, using a novel modeling technique described herein. Since telecom services coverage is very high, the present invention enables loan risk propensities of a very large number of subscribers to be determined, with consequent advantages such as easy loan availability and low cost of underwriting.
[00030] Within the scope of this application it is expressly envisaged that the various aspects, embodiments, examples and alternatives set out in the preceding paragraphs, in the claims and/or in the following description and drawings, and in particular the individual features thereof, may be taken independently or in any combination. Features described in connection with one embodiment are applicable to all embodiments, unless such features are incompatible.
[00031] Various objects, features, aspects and advantages of the present disclosure will become more apparent from the following detailed description of preferred embodiments, along with the accompanying drawing figures in which like numerals represent like features.
BRIEF DESCRIPTION OF DRAWINGS
[00032] The accompanying drawings are included to provide a further understanding of the present disclosure, and are incorporated in and constitute a part of this specification. The drawings illustrate exemplary embodiments of the present disclosure and, together with the description, serve to explain the principles of the present disclosure. The diagrams are for illustration only and thus do not limit the present disclosure, and wherein:
[00033] FIG. 1 illustrates an architecture of the proposed system and its overall working in accordance with an exemplary embodiment of the present disclosure.
[00034] FIG. 2 illustrates functional units of the proposed system in accordance with an exemplary embodiment of the present disclosure.
[00035] FIG. 3A and FIG. 3B illustrate examples of working of the proposed system in accordance with an exemplary embodiment of the present disclosure.
[00036] FIGs. 4A and 4B illustrate how telecom related parameters of a plurality of subscribers can be aggregated into single view tables for generating a loan risk propensity model used in the proposed system in accordance with an exemplary embodiment of the present disclosure, while FIG. 4C illustrates how the model developed can be fine-tuned and validated.
[00037] FIG. 5 illustrates a method of working of the proposed system in accordance with an exemplary embodiment of the present disclosure.
[00038] FIG. 6 illustrates an analysis of performance of the proposed system against existing systems in accordance with an exemplary embodiment of the present disclosure.
DETAILED DESCRIPTION
[00039] The following is a detailed description of embodiments of the disclosure depicted in the accompanying drawings. The embodiments are in such detail as to clearly communicate the disclosure. However, the amount of detail offered is not intended to limit the anticipated variations of embodiments; on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present disclosure as defined by the appended claims.
[00040] In the following description, numerous specific details are set forth in order to provide a thorough understanding of embodiments of the present invention. It will be apparent to one skilled in the art that embodiments of the present invention may be practiced without some of these specific details.
[00041] Embodiments of the present invention include various steps, which will be described below. The steps may be performed by hardware components or may be embodied in machine-executable instructions, which may be used to cause a general-purpose or special-purpose processor programmed with the instructions to perform the steps. Alternatively, steps may be performed by a combination of hardware, software, and firmware and/or by human operators.
[00042] Various methods described herein may be practiced by combining one or more machine-readable storage media containing the code according to the present invention with appropriate standard computer hardware to execute the code contained therein. An apparatus for practicing various embodiments of the present invention may involve one or more computers (or one or more processors within a single computer) and storage systems containing or having network access to computer program(s) coded in accordance with various methods described herein, and the method steps of the invention could be accomplished by modules, routines, subroutines, or subparts of a computer program product.
[00043] If the specification states a component or feature “may”, “can”, “could”, or “might” be included or have a characteristic, that particular component or feature is not required to be included or have the characteristic.
[00044] As used in the description herein and throughout the claims that follow, the meaning of “a,” “an,” and “the” includes plural reference unless the context clearly dictates otherwise. Also, as used in the description herein, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise.
[00045] Exemplary embodiments will now be described more fully hereinafter with reference to the accompanying drawings, in which exemplary embodiments are shown. These exemplary embodiments are provided only for illustrative purposes and so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those of ordinary skill in the art. The invention disclosed may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Various modifications will be readily apparent to persons skilled in the art. The general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the invention. Moreover, all statements herein reciting embodiments of the invention, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future (i.e., any elements developed that perform the same function, regardless of structure). Also, the terminology and phraseology used is for the purpose of describing exemplary embodiments and should not be considered limiting. Thus, the present invention is to be accorded the widest scope encompassing numerous alternatives, modifications and equivalents consistent with the principles and features disclosed. For purpose of clarity, details relating to technical material that is known in the technical fields related to the invention have not been described in detail so as not to unnecessarily obscure the present invention.
[00046] Thus, for example, it will be appreciated by those of ordinary skill in the art that the diagrams, schematics, illustrations, and the like represent conceptual views or processes illustrating systems and methods embodying this invention. The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing associated software. Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the entity implementing this invention. Those of ordinary skill in the art further understand that the exemplary hardware, software, processes, methods, and/or operating systems described herein are for illustrative purposes and, thus, are not intended to be limited to any particular named element.
[00047] Embodiments of the present invention may be provided as a computer program product, which may include a machine-readable storage medium tangibly embodying thereon instructions, which may be used to program a computer (or other electronic devices) to perform a process. The term “machine-readable storage medium” or “computer-readable storage medium” includes, but is not limited to, fixed (hard) drives, magnetic tape, floppy diskettes, optical disks, compact disc read-only memories (CD-ROMs), magneto-optical disks, and semiconductor memories such as ROMs, random access memories (RAMs), programmable read-only memories (PROMs), erasable PROMs (EPROMs), electrically erasable PROMs (EEPROMs), flash memory, magnetic or optical cards, or other types of media/machine-readable media suitable for storing electronic instructions (e.g., computer programming code, such as software or firmware). A machine-readable medium may include a non-transitory medium in which data can be stored and that does not include carrier waves and/or transitory electronic signals propagating wirelessly or over wired connections. Examples of a non-transitory medium may include, but are not limited to, a magnetic disk or tape, optical storage media such as compact disk (CD) or digital versatile disk (DVD), flash memory, memory or memory devices. A computer-program product may include code and/or machine-executable instructions that may represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements. A code segment may be coupled to another code segment or a hardware circuit by passing and/or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, etc. may be passed, forwarded, or transmitted via any suitable means including memory sharing, message passing, token passing, network transmission, etc.
[00048] Furthermore, embodiments may be implemented by hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the necessary tasks (e.g., a computer-program product) may be stored in a machine-readable medium. A processor(s) may perform the necessary tasks.
[00049] Systems depicted in some of the figures may be provided in various configurations. In some embodiments, the systems may be configured as a distributed system where one or more components of the system are distributed across one or more networks in a cloud computing system.
[00050] Each of the appended claims defines a separate invention, which for infringement purposes is recognized as including equivalents to the various elements or limitations specified in the claims. Depending on the context, all references below to the "invention" may in some cases refer to certain specific embodiments only. In other cases it will be recognized that references to the "invention" will refer to subject matter recited in one or more, but not necessarily all, of the claims.
[00051] All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided with respect to certain embodiments herein is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention otherwise claimed. No language in the specification should be construed as indicating any non-claimed element essential to the practice of the invention.
[00052] Various terms as used herein are shown below. To the extent a term used in a claim is not defined below, it should be given the broadest definition persons in the pertinent art have given that term as reflected in printed publications and issued patents at the time of filing.
[00053] In an aspect, a system to determine loan risk propensity of a subscriber of a telecom service as disclosed herein can include: one or more processors; a modeler unit to control the one or more processors to generate a risk propensity model based upon telecom related parameters of a plurality of subscribers of the telecom service; a receiver to control the one or more processors to receive telecom data pertaining to the subscriber; and a loan risk propensity generator to control the one or more processors to apply the risk propensity model to the telecom data to generate loan risk propensity of the subscriber.
[00054] In another aspect, the modeler unit can filter the plurality of subscribers to generate a filtered dataset of the plurality of subscribers; can aggregate the telecom related parameters of the filtered dataset to generate an aggregate data; can reduce variables in the aggregate data; can generate cleaned aggregate data based on the reduced variables; and can use one or more analytic techniques on the cleaned aggregate data to generate the risk propensity model.
[00055] In yet another aspect, the filters can pertain to any or a combination of tenure with the telecom service of each of the plurality of subscribers and category of telecom connection of each of the plurality of subscribers with the telecom service.
[00056] In an aspect, the telecom related parameters can include any or a combination of demographics of the filtered dataset, the telecom service usage data of the filtered dataset, and the telecom service bill payment data of the filtered dataset.
[00057] In another aspect, the modeler unit can apply any or a combination of correlation, multicollinearity and information value to the aggregate data to reduce the variables in the aggregate data.
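By way of a non-limiting illustration, reduction of variables by information value may be sketched as follows. The sketch assumes that each variable has already been binned and that a binary default flag is available for each subscriber in the aggregate data. The function names `information_value` and `select_variables`, the additive smoothing constant of 0.5 and the cut-off of 0.02 are assumptions made for illustration and are not taken from the present disclosure.

```python
import math
from collections import Counter


def information_value(bins, defaulted, smoothing=0.5):
    """Information value of one binned variable against a binary default flag.

    bins      -- bin label of the variable for each subscriber
    defaulted -- True/1 if the subscriber defaulted, False/0 otherwise
    smoothing -- assumed additive constant that avoids log(0) for empty cells
    """
    good, bad = Counter(), Counter()
    for label, flag in zip(bins, defaulted):
        (bad if flag else good)[label] += 1

    labels = set(good) | set(bad)
    total_good = sum(good.values()) + smoothing * len(labels)
    total_bad = sum(bad.values()) + smoothing * len(labels)

    iv = 0.0
    for label in labels:
        pct_good = (good[label] + smoothing) / total_good
        pct_bad = (bad[label] + smoothing) / total_bad
        # The weight of evidence of the bin is ln(pct_good / pct_bad).
        iv += (pct_good - pct_bad) * math.log(pct_good / pct_bad)
    return iv


def select_variables(binned_columns, defaulted, min_iv=0.02):
    """Keep variables whose information value reaches an assumed threshold."""
    return [
        name
        for name, bins in binned_columns.items()
        if information_value(bins, defaulted) >= min_iv
    ]
```

In such a sketch, a variable whose bins are distributed identically among defaulting and non-defaulting subscribers has an information value of zero and would be dropped, while a variable that separates the two groups would be retained.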
[00058] In yet another aspect, the one or more analytic techniques can include any or a combination of regression, Random Forest, XGBoost, K-nearest neighbor, likelihood-ratio, and deep learning.
[00059] In an aspect, the proposed system can further include a loan proposal unit to control the one or more processors to propose a loan to the subscriber based upon the loan risk propensity, purpose of the loan and amount of the loan.
[00060] In an aspect, the present disclosure elaborates upon a method to determine loan risk propensity of a subscriber of a telecom service, the method including: generating, at a first computing device, a risk propensity model based upon telecom related parameters of a plurality of subscribers of the telecom service; receiving, at a second computing device operatively connected to the first computing device, telecom data pertaining to the subscriber; and generating, at a third computing device operatively connected to the first computing device and the second computing device, loan risk propensity of the subscriber by applying the risk propensity model to the telecom data, wherein functionalities of any or a combination of the first computing device, the second computing device and the third computing device can be located in a single computing device.
[00061] In another aspect, generating the risk propensity model can include: filtering the plurality of subscribers to generate a filtered dataset of the plurality of subscribers; aggregating the telecom related parameters of the filtered dataset to generate an aggregate data; reducing variables in the aggregate data; generating cleaned aggregate data based on the reduced variables; and using one or more analytic techniques on the cleaned aggregate data to generate the risk propensity model.
[00062] In yet another aspect, the method can include proposing a loan to the subscriber based upon the loan risk propensity, purpose of the loan and amount of the loan.
[00063] In an aspect, while existing systems rely on existing financial data of a customer (such as bank statements, past credit history, etc.), the proposed system seeks to analyze a customer’s/subscriber’s behavior pertaining to telecom services provided to the subscriber to derive the subscriber’s loan risk propensity/risk rating/credit worthiness/credit rating (the terms being used interchangeably herein). The proposed system uses the telecom data of a subscriber of a telecom service, without using any of the existing financial parameters. Behavior related to phone usage (incoming, outgoing, roaming, data usage, SMS usage, peak time and off-peak time usage, payment frequency and mode, late payments and so on) can be analyzed using the proposed system along with customer demographics, handset details and various other parameters to arrive at a risk rating of a subscriber and to predict the possibility of default on a loan by the subscriber. As can be readily understood, the system aims to cover entities/individuals/subscribers/organizations not presently covered by existing rating agencies/credit bureaus that use existing financial data and history for the purpose. Hence, the system can find ready adoption, for instance, amongst micro-industries, students seeking loans, workers just entering an employment stream, unorganized sector personnel (which is a huge market segment by itself), etc.
[00064] Proposed system can be well-suited for unsecured lending for loans less than Rs. 100,000/=, for example. Credit ratings provided by the proposed system correlate well with credit bureau scores determined on basis of traditional methodologies. Hence, proposed system can be used in place of credit bureaus.
[00065] In an exemplary embodiment, proposed system can build a model on a subscriber/customer base of a telecom service that can initially number about 18 million, which may be reduced to about 6 million after applying the filters elaborated further below. Some of this remaining data (say 80 %) may be used for ‘training’ purposes (i.e., for building the model described herein) and the remainder for ‘validation’ purposes (i.e., for verifying the results provided by the model described herein by comparing such results to those provided by other systems, such as those of another credit rating agency).
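The training/validation split described above can be sketched as follows. This is a minimal illustration only: the function name, the random shuffling and the fixed seed are assumptions made for the example and are not taken from the disclosure; only the 80 % training share comes from the text.

```python
import random

def split_train_validation(subscriber_ids, train_fraction=0.8, seed=42):
    """Shuffle subscriber IDs and split them into training and validation sets."""
    ids = list(subscriber_ids)
    rng = random.Random(seed)  # fixed seed (assumed) so the split is reproducible
    rng.shuffle(ids)
    cut = int(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]

# 1,000 hypothetical subscriber IDs stand in for the filtered dataset.
train_ids, validation_ids = split_train_validation(range(1000))
```

Splitting on subscriber IDs keeps all aggregated data of one subscriber on the same side of the split.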
[00066] In an exemplary embodiment, proposed system can develop a risk propensity model (interchangeably termed as model herein) based upon a dataset of telecom related parameters of postpaid subscribers of a telecom service (TS). While aggregates of calls made by a subscriber are used, call data records may not be used in the model thus preserving subscriber privacy.
[00067] As elaborated further, the model can be used to generate loan risk propensity for a subscriber (interchangeably termed as an entity herein). The loan risk propensity generated can indicate to another entity (for example, a loan granter) the risk of default on a loan taken by the entity. For example, if the loan risk propensity of an entity (say individual ‘X’) generated by proposed system is 0.8, it can indicate to a loan granter that there is an 80 % probability that a loan given to ‘X’ will be fully recovered according to the terms of the loan.
[00068] Various filters may be used on the initial subscriber base to start with. For instance, a subscriber who has been with the TS for less than a year may be removed from the dataset so that such subscriber’s telecom related parameters are not considered for building the model. Likewise, corporate-owned, corporate-paid connections may be removed to consider only individual/retail subscribers. It can be appreciated that a variety of filters can be implemented on the dataset to generate a filtered dataset as required.
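The two example filters above (tenure and connection category) might be expressed as in the following sketch. The `Subscriber` record, its field names and the category labels are hypothetical and are introduced only for illustration; the one-year tenure threshold is the example given in the text.

```python
from dataclasses import dataclass

@dataclass
class Subscriber:
    subscriber_id: str
    tenure_months: int        # time with the telecom service
    connection_category: str  # hypothetical labels, e.g. "retail" or "corporate_paid"

def apply_filters(subscribers, min_tenure_months=12,
                  excluded_categories=("corporate_paid",)):
    """Keep subscribers with enough tenure whose connection category is not excluded."""
    return [
        s for s in subscribers
        if s.tenure_months >= min_tenure_months
        and s.connection_category not in excluded_categories
    ]

sample = [
    Subscriber("A", 30, "retail"),
    Subscriber("B", 6, "retail"),           # removed: less than a year of tenure
    Subscriber("C", 48, "corporate_paid"),  # removed: corporate-paid connection
]
filtered = apply_filters(sample)
```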
[00069] Further, proposed system can aggregate values of telecom related parameters of subscribers that are in the filtered dataset over a pre-determined period (say a year). Such telecom related parameters can include, for example, any or a combination of: subscriber demographics (gender, age, city, pincode, vintage); bill payment behavior (payments done on time, late payments, mode of payment); usage data (incoming, outgoing, roaming, data usage, SMS usage, peak time and off-peak time usage, 3G/4G usage, etc.); and barring and disconnections (number of times barred and disconnected, temporarily and permanently).
[00070] All such data can be aggregated in the format of single view tables aggregated at a subscriber level for use in building the model (for instance, as elaborated in FIG. 4A). In this manner, an aggregate data can be created for processing as further elaborated.
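A subscriber-level single view table of this kind can be sketched as below. The input is assumed to be one summary record per subscriber per billing month (not call data records); the field names are hypothetical and the three aggregated parameters are a small illustrative subset.

```python
from collections import defaultdict

def build_single_view(monthly_records):
    """Aggregate per-month summary records into one row per subscriber."""
    totals = defaultdict(lambda: {
        "outgoing_minutes": 0.0, "data_mb": 0.0, "late_payments": 0, "months": 0,
    })
    for rec in monthly_records:
        row = totals[rec["subscriber_id"]]
        row["outgoing_minutes"] += rec["outgoing_minutes"]
        row["data_mb"] += rec["data_mb"]
        row["late_payments"] += 1 if rec["paid_late"] else 0
        row["months"] += 1
    return dict(totals)

records = [
    {"subscriber_id": "A", "outgoing_minutes": 120.0, "data_mb": 800.0, "paid_late": False},
    {"subscriber_id": "A", "outgoing_minutes": 80.0, "data_mb": 1200.0, "paid_late": True},
    {"subscriber_id": "B", "outgoing_minutes": 30.0, "data_mb": 100.0, "paid_late": False},
]
single_view = build_single_view(records)
```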
[00071] Further, proposed system can subject the aggregate data to various cleaning techniques/operations to reduce variables (telecom related parameters) of the aggregate data while retaining its predictive ability. Such operations can include, for instance, removing variables with high correlation. For instance, only one of the variables with a correlation coefficient of more than 0.6 may be retained to reduce correlated variables in the data. For instance, it may be determined that people between age of 50-60 years have a very high correlation to paying their bills on time and for such a data set of subscribers, timely payment of bills may not be considered as a variable.
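One way to retain only one variable out of each highly correlated group is sketched below, using the Pearson correlation coefficient and the 0.6 threshold mentioned above. The greedy keep-first strategy, the use of the absolute value of the coefficient and the variable names are assumptions; the disclosure does not specify which of the correlated variables is retained.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def drop_correlated(columns, threshold=0.6):
    """Keep a variable only if it is not highly correlated with one already kept."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

columns = {
    "outgoing_minutes": [100, 200, 300, 400, 500],
    "outgoing_calls": [11, 19, 32, 41, 50],  # nearly proportional to minutes
    "late_payments": [2, 0, 3, 0, 2],
}
kept = drop_correlated(columns)
```

In this toy data `outgoing_calls` is dropped because it tracks `outgoing_minutes` almost exactly, while `late_payments` is retained.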
[00072] Variables may be checked for multicollinearity using, for instance, VIF (variance inflation factor), and variables with a VIF of more than 2 may be removed from the aggregate data. Further, variables may be ranked in terms of their information value, and those with an information value below a pre-determined value may be removed from the aggregate data, so that only variables with an information value above the pre-determined value are retained. For instance, only the top 30 variables may be used in building the model. These procedures can finally lead to a ‘cleaned aggregate data’.
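The information value ranking can be illustrated with the commonly used definition, in which each variable is binned and the information value is the sum over bins of (share of goods − share of bads) × ln(share of goods / share of bads). The sketch below assumes every bin contains at least one good and one bad outcome; the binning, the good/bad labels and the variable names are hypothetical.

```python
import math

def information_value(bins):
    """bins: list of (good_count, bad_count) pairs, one per bin of a variable."""
    total_good = sum(good for good, _ in bins)
    total_bad = sum(bad for _, bad in bins)
    iv = 0.0
    for good, bad in bins:
        share_good = good / total_good
        share_bad = bad / total_bad
        iv += (share_good - share_bad) * math.log(share_good / share_bad)
    return iv

def top_variables(binned_variables, top_n=30):
    """Rank variables by information value and keep the top_n names."""
    ranked = sorted(binned_variables,
                    key=lambda name: information_value(binned_variables[name]),
                    reverse=True)
    return ranked[:top_n]

binned = {
    "handset_colour": [(50, 5), (50, 5)],  # same good/bad mix in both bins
    "late_payments": [(80, 2), (20, 8)],   # bins separate goods from bads
}
best = top_variables(binned, top_n=1)
```

A variable whose bins all have the same good/bad mix has an information value of zero and is ranked last.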
[00073] Thereafter, proposed system can use various analytic techniques such as logistic regression to generate a ‘risk propensity model’ using the cleaned aggregate data of all the subscribers in the filtered dataset. As known, regression analysis leads to an equation wherein the coefficients in the equation define the relationship between each independent variable and the dependent variable. Once determined, the equation can as well be used to enter values for the independent variables to predict mean value of the dependent variable. In other words, the model generated can be provided input of various independent variables and can output the probability of prediction associated with the dependent variable. The independent variables can be the telecom related parameters of an entity, and the output can be loan risk propensity of the entity.
[00074] In an aspect, to generate the risk propensity model, proposed system can convert values of all categorical variables of the cleaned aggregate data (of all the subscribers in the filtered dataset) to label encoding, and can normalize values of all the numerical variables of the cleaned aggregate data to bring all of them to a single scale. As already said, the model can be built on a training sample, and can be validated on a test sample to confirm the model accuracy.
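As an illustration of how a regression equation of this kind maps independent variables to a probability, the following sketch fits a logistic regression by plain batch gradient descent and then scores a subscriber. The learning rate, number of epochs, toy data and single feature are assumptions made for the example; a practical implementation would typically rely on an established statistics library.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit the coefficients (weights) and intercept (bias) by batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias

def predict_probability(weights, bias, x):
    """Apply the fitted equation to one subscriber's variables."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))

# Toy data: one normalized variable (share of bills paid on time); label 1 = loan repaid.
rows = [[0.0], [0.2], [0.8], [1.0]]
labels = [0, 0, 1, 1]
weights, bias = fit_logistic(rows, labels)
```

The fitted `weights` play the role of the coefficients of the equation, and `predict_probability` returns the probability associated with the dependent variable.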
[00075] Categorical variables are defined as categories and cannot be used directly in mathematical operations; they need to be treated separately. Examples are month (Jan, Feb, Mar, …) or city (Delhi, Bangalore, Chennai, etc.). Numerical variables are values which are numeric in nature, and mathematical operations can be performed on these values. Examples are amount of bill paid, number of days of delay in payment of bill, STD minutes consumed during a billing period, etc.
[00076] Once the model has been generated, proposed system can take as input values of all variables (for instance, top 30 independent (predictor) variables as described above) and provide as output probability of a prediction associated with a dependent (response) variable. As can be readily appreciated, the input variables to the proposed system can be any or a combination of various telecom data as described above for an entity (for instance, values of top 30 telecom related parameters of a subscriber ‘Y’), and as output proposed system can generate a credit rating/loan risk propensity of the subscriber ‘Y’.
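The two preparation steps, label encoding of categorical variables and normalization of numerical variables, can be sketched as follows. Min-max scaling to the range 0 to 1 is an assumption; the disclosure only requires that all numerical variables be brought to a single scale.

```python
def label_encode(values):
    """Map each distinct category to an integer code in order of first appearance."""
    mapping = {}
    for v in values:
        if v not in mapping:
            mapping[v] = len(mapping)
    return [mapping[v] for v in values], mapping

def min_max_normalize(values):
    """Rescale numeric values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in values]

cities = ["Delhi", "Bangalore", "Chennai", "Delhi"]
encoded_cities, city_codes = label_encode(cities)

bill_amounts = [500.0, 1500.0, 1000.0, 2500.0]
scaled_bills = min_max_normalize(bill_amounts)
```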
[00077] Besides logistic regression, proposed system can use other techniques such as Random Forest, XGBoost, K-nearest neighbor, likelihood-ratio, deep learning or any combination of these to generate the risk propensity model.
[00078] As can be appreciated, different models can be generated for different telecom circles in a country (that may or may not be operated by the same TS) to capture behavior of different subscribers in different circles. For instance, in India one model can be built for Punjab and another for Kerala.
[00079] After a model has been generated for a circle, various subscribers of the circle can be scored on the model and divided into various risk categories based upon pre-determined ranges/deciles of loan risk propensities. Subscribers below a cut-off credit score can be classified as ‘Bad’, while others can be further divided into High Risk, Medium Risk and Low Risk based on the deciles in which they fall.
[00080] The loan risk propensity scores/values can be further validated against credit ratings generated by a credit bureau. Credit Information Bureau (India) Limited (CIBIL) is one such bureau. In an exemplary embodiment, a set of 1 million subscribers was found to have a very high correlation (decile wise) between the CIBIL scores and the scores predicted by the model proposed herein, thereby indicating high accuracy of the model disclosed.
[00081] The accuracy of the proposed model, when evaluated as described above against CIBIL scores using known accuracy measurement techniques such as KS (Kolmogorov-Smirnov), AUC (area under curve) and GINI, led to a result of 52 % using KS, 85 % using AUC and 0.7 using GINI.
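The division into risk categories can be sketched as below. The cut-off of 0.3 and the decile ranges used for High, Medium and Low Risk are hypothetical values chosen for illustration; the disclosure only states that such ranges are pre-determined. A higher score is taken to mean a higher probability of full recovery, consistent with the 0.8 example given earlier.

```python
def assign_deciles(scores):
    """Return a decile (1 = lowest scores, 10 = highest) for each score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    deciles = [0] * len(scores)
    for rank, idx in enumerate(order):
        deciles[idx] = rank * 10 // len(scores) + 1
    return deciles

def risk_category(score, decile, bad_cutoff=0.3):
    """Classify one subscriber; the cut-off and decile bands are assumed values."""
    if score < bad_cutoff:
        return "Bad"
    if decile <= 5:
        return "High Risk"
    if decile <= 8:
        return "Medium Risk"
    return "Low Risk"

scores = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
deciles = assign_deciles(scores)
categories = [risk_category(s, d) for s, d in zip(scores, deciles)]
```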
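For reference, the three measures can be computed from a list of model scores and binary outcomes as sketched below. The binary good/bad labels are an assumption made for the example (the disclosure evaluates against CIBIL scores and does not state how they are binarized). The GINI coefficient is related to AUC by GINI = 2 × AUC − 1, so an AUC of 0.85 corresponds to a GINI of 0.70.

```python
def auc(scores, labels):
    """Probability that a randomly chosen good (1) is scored above a randomly chosen bad (0)."""
    goods = [s for s, y in zip(scores, labels) if y == 1]
    bads = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if g > b else 0.5 if g == b else 0.0 for g in goods for b in bads)
    return wins / (len(goods) * len(bads))

def ks_statistic(scores, labels):
    """Maximum gap between the cumulative score distributions of goods and bads."""
    goods = [s for s, y in zip(scores, labels) if y == 1]
    bads = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    for t in sorted(set(scores)):
        cdf_good = sum(s <= t for s in goods) / len(goods)
        cdf_bad = sum(s <= t for s in bads) / len(bads)
        best = max(best, abs(cdf_good - cdf_bad))
    return best

def gini(auc_value):
    return 2 * auc_value - 1

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
model_auc = auc(scores, labels)
model_ks = ks_statistic(scores, labels)
model_gini = gini(model_auc)
```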
[00082] FIG. 1 illustrates architecture of system proposed to illustrate its overall working in accordance with an exemplary embodiment of the present disclosure.
[00083] As shown in FIG. 1, proposed system 102 can receive a subscriber telecom data 104. Thereafter, proposed system can apply risk propensity model as elaborated above to the subscriber telecom data 104 to generate subscriber loan risk propensity shown as 106.
[00084] It can be readily appreciated that subscriber telecom data need not be necessarily provided by the subscriber himself/herself. Any authorized entity (for example, a loan granter) can be given access to the subscriber telecom data by the telecom service provider (TSP) of the subscriber and thereafter, the loan granter can use the data to generate a credit ranking of the subscriber. At the same time, a subscriber can as well generate his/her own loan risk propensity in a similar manner, when authorized to access and use proposed system.
[00085] FIG. 2 illustrates functional units of system proposed in accordance with an exemplary embodiment of the present disclosure.
[00086] FIG. 2 illustrates various components of a proposed system 102. In an example, the system 102 may be implemented in a server. The system 102 may be in communication with one/more computing devices through a communication network to receive various inputs and generate outputs as further elaborated.
[00087] In an implementation, proposed system 102 as described herein may be implemented in a variety of types of computing device, including without limitation, a desktop computer system, a data entry terminal, a laptop computer, a notebook computer, a tablet computer, a handheld personal data assistant, a smartphone, a body-worn computing device incorporated into clothing, a computing device integrated into a vehicle (e.g., a car, a bicycle, etc.), a server, a cluster of servers, a server farm, etc.
[00088] In another aspect, relevant units as described further of the proposed system 102 can be configured to be operatively connected to a website, or to a mobile application that can be downloaded on a mobile device that can connect to Internet. In such fashion the proposed system can be available 24*7 to its various users. Any other manner of implementation of the proposed system or a part thereof is well within the scope of the present disclosure/invention.
[00089] In an aspect, system 102 to determine loan risk propensity of a subscriber of a telecom service may include one or more processor(s) 202. The one or more processor(s) 202 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, logic circuitries, and/or any devices that manipulate data based on operational instructions. Among other capabilities, the one or more processor(s) 202 are configured to fetch and execute computer-readable instructions stored in a memory 204 of the system 102. The memory 204 may store one or more computer-readable instructions or routines, which may be fetched and executed to create or share the data units over a network service. The memory 204 may include any non-transitory storage device including, for example, volatile memory such as RAM, or non-volatile memory such as EPROM, flash memory, and the like.
[00090] The system 102 may also include an interface(s) 206. The interface(s) 206 may include a variety of interfaces, for example, interfaces for data input and output devices, referred to as I/O devices, storage devices, and the like. The interface(s) 206 may facilitate communication of the system 102 with various devices coupled to the system 102. The interface(s) 206 may also provide a communication pathway for one or more components of the system 102. Examples of such components include, but are not limited to, loan risk propensity generator and database, as further described.
[00091] In an aspect, other components of the proposed system 102 can include a modeler unit 208, a receiver 210, a loan risk propensity generator 212, a database 214 and a loan proposal unit 216, besides other components/units.
[00092] Components and units as above and further described may be implemented as a combination of hardware and programming (for example, programmable instructions) to implement their functionalities either themselves or using processor(s) 202. In examples described herein, such combinations of hardware and programming may be implemented in several different ways. For example, the programming for the processor(s) 202 may be processor executable instructions stored on a non-transitory machine-readable storage medium and the hardware for the processor(s) 202 may include a processing resource (for example, one or more processors), to execute such instructions. The system 102 may include the machine-readable storage medium storing the instructions and the processing resource to execute the instructions, or the machine-readable storage medium may be separate but accessible to system 102 and the processing resource. In other examples, the units as above and further described and/or processor(s) 202 may be implemented by electronic circuitry.
[00093] The database 214 may include data that is either stored or generated as a result of functionalities implemented by any of the components being described.
Modeler Unit 208
[00094] In an aspect, modeler unit 208 can control the one or more processors 202 to generate a risk propensity model based upon telecom related parameters of a plurality of subscribers of the telecom service.
[00095] In another aspect, modeler unit 208 can filter the plurality of subscribers to generate a filtered dataset of the plurality of subscribers and can aggregate the telecom related parameters of the filtered dataset to generate an aggregate data.
[00096] Thereafter modeler unit 208 can reduce variables in the aggregate data and generate cleaned aggregate data based on the reduced variables.
[00097] Further, modeler unit 208 can use one or more analytic techniques on the cleaned aggregate data to generate the risk propensity model.
[00098] The filters can pertain to any or a combination of tenure with the telecom service of each of the plurality of subscribers and category of telecom connection of each of the plurality of subscribers with the TSP.
[00099] The telecom related parameters can include any or a combination of demographics of the filtered dataset, telecom services usage data of the filtered dataset, and telecom services bill payment data of the filtered dataset.
[000100] Modeler unit 208 can apply any or a combination of correlation, multicollinearity and information value to the aggregate data to reduce the variables in the aggregate data.
[000101] The one or more analytic techniques can include any or a combination of regression, Random Forest, XGBoost, K-nearest neighbor, likelihood-ratio, and deep learning.
[000102] In an exemplary embodiment, modeler unit 208 can be one or more processors dedicatedly used to perform functionalities of modeler unit 208 as elaborated above.
[000103] Various ways can be used to provide initial subscriber data of the telecom service to modeler unit 208. For instance, receiver 210 can be used for the purpose.
Receiver 210
[000104] In an aspect, receiver 210 can control the one or more processors 202 to receive telecom data pertaining to a subscriber.
[000105] Receiver 210 can as well be used to provide initial subscriber data of the telecom service to modeler unit 208.
[000106] In an exemplary embodiment, receiver 210 can be any device, such as a transceiver, having a transmitter or a receiver or both a transmitter and a receiver that are combined and share common circuitry or a single housing for receiving and/or transmitting various data to/from the proposed system.
Loan Risk Propensity Generator 212
[000107] In an aspect, loan risk propensity generator 212 can control the one or more processors 202 to apply the risk propensity model to the telecom data to generate loan risk propensity of the subscriber.
[000108] In an exemplary embodiment, loan risk propensity generator 212 can be one or more processors dedicatedly used to perform functionalities of generator 212 as elaborated above.
Database 214
[000109] In an aspect, database 214 can store all data pertaining to the proposed system 102 as being described herein, and can make such data available to other units. Such data can include, for instance, demographic and other telecom related parameters pertaining to all subscribers of a telecom circle, filtered datasets as per queries set by a system administrator of proposed system, cleaned aggregate data and various attributes ( for example coefficients of equation formed) of the risk propensity model generated by the proposed system.
[000110] Database 214 can enable storage and retrieval of various data therein based upon instructions received from one or more processors 202 under control of units as described above.
Loan Proposal Unit 216
[000111] In an aspect, loan proposal unit 216 can control the one or more processors 202 to propose a loan to the subscriber based upon said loan risk propensity, purpose of said loan and amount of said loan.
[000112] Using unit 216, proposed system can propose loans to different subscribers adapted to their loan risk propensities as generated by it. Amount, tenure and interest of the loan can be varied. In this manner, proposed system can build different credit scores/ratings for specific use cases that may be more accurate than one universal credit score/rating. Risk-based pricing of an asset can also be implemented based on loan risk propensity of an entity.
[000113] In an exemplary embodiment, loan proposal unit 216 can be one or more processors dedicatedly used to perform functionalities of unit 216 as elaborated above.
[000114] It would be appreciated that components and units as described above are only exemplary units and any other unit or sub-unit can be included as part of the proposed system. These units too can be merged or divided into super-units or sub-units as may be configured and can be spread across one or more computing devices operatively connected to each other using appropriate communication technologies.
[000115] Further, although the proposed system has been elaborated as above to include all the main units, it is completely possible that actual implementations may include only a part of the proposed units or a combination of those or a division of those into sub-units in various combinations across multiple devices that can be operatively coupled with each other, including in the cloud. Further the units can be configured in any sequence to achieve objectives elaborated. Also, it can be appreciated that proposed system can be configured in a computing device or across a plurality of computing devices operatively connected with each other, wherein the computing devices can be any of a computer, a laptop, a smart phone, an Internet enabled mobile device and the like. Therefore, all possible modifications, implementations and embodiments of where and how the proposed system is configured are well within the scope of the present invention.
[000116] FIG. 3A and FIG. 3B illustrate examples of working of system proposed in accordance with an exemplary embodiment of the present disclosure.
[000117] In an aspect, proposed system can use a mobile application that can be downloaded onto a smartphone. The mobile application can operatively connect the smartphone (using any data communication means, for instance Internet) while the proposed system can reside at a central server/cloud. The mobile application can provide appropriate interfaces on the smartphone.
[000118] As shown in FIG. 3A, a subscriber can provide, at 302, the various telecom data pertaining to the subscriber that are required by the proposed system. The subscriber can also indicate the loan amount sought and its purpose, as shown at 304. Thereafter, the subscriber can press the ‘Submit’ button 308 on the mobile device.
[000119] Thereafter the telecom data of the subscriber can be sent to the proposed system that can apply the risk propensity model as described above to the telecom data to generate loan risk propensity of the subscriber.
[000120] As illustrated in FIG. 3B, the loan risk propensity of the subscriber can then be advised to the subscriber as shown at 310. Besides, proposed system can also make a loan proposal for the subscriber as shown at 312, indicating total loan amount that can be offered, tenure, interest and equated monthly installments (EMI). Using Print button 314, the subscriber can print the loan proposal, or can use End button 314 to end the process.
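The disclosure does not state how the equated monthly installment in the loan proposal is computed. As a sketch only, the widely used reducing-balance formula EMI = P × r × (1 + r)^n / ((1 + r)^n − 1) can be applied, where P is the loan amount, r the monthly interest rate and n the tenure in months; the function name and the sample loan terms are assumptions.

```python
def equated_monthly_installment(principal, annual_rate_percent, tenure_months):
    """Standard reducing-balance EMI; assumed here, not specified by the disclosure."""
    monthly_rate = annual_rate_percent / 12 / 100
    if monthly_rate == 0:
        return principal / tenure_months  # interest-free loan: equal principal parts
    growth = (1 + monthly_rate) ** tenure_months
    return principal * monthly_rate * growth / (growth - 1)

# Hypothetical proposal: Rs. 100,000 at 12 % per annum over 12 months.
emi = equated_monthly_installment(100_000, 12.0, 12)
total_repaid = emi * 12
```

Varying the interest rate or tenure with the subscriber's loan risk propensity changes the installment accordingly.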
[000121] It can be readily appreciated that the subscriber may need to be duly authorized to operate system proposed. Further, any other entity so authorized – for instance, a loan granter or a credit rating bureau can similarly use the system proposed.
[000122] FIGs. 4A and 4B illustrate how telecom related parameters of a plurality of subscribers can be aggregated into single view tables for generating a risk propensity model used in system proposed in accordance with an exemplary embodiment of the present disclosure, while FIG. 4C illustrates how the model developed can be fine-tuned and validated.
[000123] As illustrated in FIG. 4A, proposed system can take various telecom related parameters such as customer profile (402), usage data (404), payment data (406) and device information (408) for each subscriber of the filtered dataset arrived at as described above to create a single view table (410) that can be used as input for developing the model (as shown at 412).
[000124] Various data pertaining to telecom related parameters can be as shown in FIG. 4B. For instance, customer profile can include, for each subscriber in the filtered dataset, gender, date of birth, address, city and state, and age on network. Similarly, payment data can include bill payment data, bill due date, payment amount, temporary and permanent disconnection, call barring, and payment mode and frequency.
[000125] As shown in FIG. 4C, some portion of the filtered dataset (illustrated as original data 420) can be used as ‘training data’ 422 for developing/training and validating the model described as shown at blocks 424 and 426. Machine learning and classification algorithm (as shown at block 428, using logistic regression for instance) can be used to generate the risk propensity model 430. Validation data 426 can be used to tune and evaluate the machine learning and classification algorithm developed. Finally, testing data 432 can be used to check on performance of model developed.
[000126] FIG. 5 illustrates a method of working of system proposed in accordance with an exemplary embodiment of the present disclosure.
[000127] The method comprises: at step 502, generating, at a first computing device, a risk propensity model based upon telecom related parameters of a plurality of subscribers; at step 504, receiving, at a second computing device operatively connected to the first computing device, telecom data pertaining to a subscriber; and at step 506, generating, at a third computing device operatively connected to the first computing device and the second computing device, loan risk propensity of the subscriber by applying the risk propensity model to the telecom data, wherein functionalities of any or a combination of the first computing device, the second computing device and the third computing device are located in a single computing device.
[000128] It can readily be appreciated that any combination of the computing devices described above can be merged. For instance, proposed system can be at a central server and can receive therein the telecom related parameters of a plurality of subscribers of a telecom service to generate the risk propensity model. Further, proposed system can receive, from a subscriber, via the subscriber’s smartphone (second computing device), the subscriber’s telecom data. Both the telecom data and the model can be passed to a third computing device (that can belong to a loan granter, for example). The model can be applied to the telecom data and the loan risk propensity of the subscriber generated at the third computing device. All such embodiments and their modifications are fully a part of the present disclosure.
[000129] The proposed method as elaborated above can be described in general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, etc., that perform particular functions or implement particular abstract data types. The method can also be practiced in a distributed computing environment where functions are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, computer executable instructions may be located in both local and remote computer storage media, including memory storage devices.
[000130] The order in which the method is described is not intended to be construed as a limitation, and any number of the described method blocks/steps can be combined in any order to implement the method or alternate methods. Additionally, individual blocks may be deleted from the method without departing from the spirit and scope of the subject matter described herein. Furthermore, the method can be implemented in any suitable hardware, software, firmware, or combination thereof. However, for ease of explanation, in the embodiments described above, the method may be considered to be implemented in the above described system.
[000131] FIG. 6 illustrates an analysis of performance of system proposed against existing systems in accordance with an exemplary embodiment of the present disclosure.
[000132] As illustrated, system proposed and described herein approved a total of 2249 high end loans (wherein loan value was above a threshold, say Rs. 100,000/=) with a default of 9 (i.e., 0.4 %). Similarly, system proposed approved a total of 1699 regular loans (wherein loan value was below the threshold) with a default of 27, i.e., 1.59 %. On the other hand, a Bureau (for instance, a credit rating agency) approved a total of 9349 loans with a default of 498, i.e., 5.32 %. Hence, system proposed performed better than the Bureau for the sample data set considered.
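The default percentages quoted above follow from dividing the number of defaults by the number of approved loans, as the short sketch below shows. The function and dictionary names are illustrative; the counts are those reported for FIG. 6.

```python
def default_rate_percent(defaults, approved):
    """Share of approved loans that defaulted, in percent."""
    return 100.0 * defaults / approved

rates = {
    "proposed_high_end": default_rate_percent(9, 2249),
    "proposed_regular": default_rate_percent(27, 1699),
    "bureau": default_rate_percent(498, 9349),
}
```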
[000133] In this manner proposed system provides loan risk propensities/credit ratings for entities/population that have no (or limited) past credit history by using telecom data for the purpose. As is known, telecom penetration is increasing at a very fast pace worldwide. For instance, in India almost 92 % of the population is presently covered with telecom services. Hence proposed system can provide credit rating coverage for a vast majority of entities. For instance, it can provide credit rating for entities who generally earn and pay in cash (as in the unorganized labor sector), people who are just entering the job stream, self-employed and similar ‘thin file borrowers’ (borrowers with no/limited past credit/financial transactions data available).
[000134] Proposed system proposes loans to different entities adapted to their loan risk propensities as generated by it. Amount, tenure and interest of the loan can be varied. This is unlike other systems that only either decline or accept a loan application based upon an entity’s loan risk propensity. Proposed system can build different credit scores/ratings for specific use cases that may be more accurate than one universal credit score/rating. Risk-based pricing of an asset can also be implemented based on loan risk propensity of an entity.
[000135] Ratings/scores generated by proposed system can be used in conjunction with other parameters (such as a credit bureau score based on past credit history) to provide an improved indication of an entity’s credit worthiness.
[000136] Proposed system reduces the cost of credit underwriting since it eliminates the high costs involved with getting a credit bureau score as is done presently.
[000137] As used herein, and unless the context dictates otherwise, the term “coupled to” is intended to include both direct coupling (in which two elements that are coupled to each other are in contact with each other) and indirect coupling (in which at least one additional element is located between the two elements). Therefore, the terms “coupled to” and “coupled with” are used synonymously. Within the context of this document, the terms “coupled to” and “coupled with” are also used euphemistically to mean “communicatively coupled with” over a network, where two or more devices are able to exchange data with each other over the network, possibly via one or more intermediary devices.
[000138] Moreover, in interpreting both the specification and the claims, all terms should be interpreted in the broadest possible manner consistent with the context. In particular, the terms “comprises” and “comprising” should be interpreted as referring to elements, components, or steps in a non-exclusive manner, indicating that the referenced elements, components, or steps may be present, or utilized, or combined with other elements, components, or steps that are not expressly referenced. Where the specification or claims refer to at least one of something selected from the group consisting of A, B, C … and N, the text should be interpreted as requiring only one element from the group, not A plus N, or B plus N, etc.
[000139] While some embodiments of the present disclosure have been illustrated and described, those are completely exemplary in nature. The disclosure is not limited to the embodiments as elaborated herein only and it would be apparent to those skilled in the art that numerous modifications besides those already described are possible without departing from the inventive concepts herein. All such modifications, changes, variations, substitutions, and equivalents are completely within the scope of the present disclosure. The inventive subject matter, therefore, is not to be restricted except in the spirit of the appended claims.
ADVANTAGES OF THE INVENTION
[000140] Present disclosure provides for a system that provides loan risk propensities of entities that have no /insufficient past credit histories.
[000141] Present disclosure provides for a system that adapts to loan risk propensity of an entity to propose loan for the entity accordingly.
[000142] Present disclosure provides for a system that reduces cost of credit underwriting.
| # | Name | Date |
|---|---|---|
| 1 | 201821026351-STATEMENT OF UNDERTAKING (FORM 3) [13-07-2018(online)].pdf | 2018-07-13 |
| 2 | 201821026351-FORM FOR STARTUP [13-07-2018(online)].pdf | 2018-07-13 |
| 3 | 201821026351-FORM FOR SMALL ENTITY(FORM-28) [13-07-2018(online)].pdf | 2018-07-13 |
| 4 | 201821026351-FORM 1 [13-07-2018(online)].pdf | 2018-07-13 |
| 5 | 201821026351-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [13-07-2018(online)].pdf | 2018-07-13 |
| 6 | 201821026351-EVIDENCE FOR REGISTRATION UNDER SSI [13-07-2018(online)].pdf | 2018-07-13 |
| 7 | 201821026351-DRAWINGS [13-07-2018(online)].pdf | 2018-07-13 |
| 8 | 201821026351-DECLARATION OF INVENTORSHIP (FORM 5) [13-07-2018(online)].pdf | 2018-07-13 |
| 9 | 201821026351-COMPLETE SPECIFICATION [13-07-2018(online)].pdf | 2018-07-13 |
| 10 | 201821026351-FORM-9 [16-07-2018(online)].pdf | 2018-07-16 |
| 11 | ABSTRACT1.jpg | 2018-08-12 |
| 12 | 201821026351-Proof of Right (MANDATORY) [27-09-2018(online)].pdf | 2018-09-27 |
| 13 | 201821026351-FORM-26 [27-09-2018(online)].pdf | 2018-09-27 |
| 14 | 201821026351-ORIGINAL UR 6(1A) FORM 1 & FORM 26-280918.pdf | 2019-02-01 |