FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION
(See Section 10 and Rule 13)
METHOD AND SYSTEM FOR PURCHASE BEHAVIOR PREDICTION OF CUSTOMERS
Applicant:
Tata Consultancy Services Limited
A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman point, Mumbai 400021,
Maharashtra, India
The following specification particularly describes the invention and the manner in which it is to be performed.
CROSS-REFERENCE TO RELATED APPLICATIONS AND PRIORITY
[001] The present application claims priority to Patent Application Serial Number 4550/MUM/2015, filed before the Indian Patent Office on 02/Dec/2015, and incorporates that application in its entirety.
TECHNICAL FIELD
[002] The embodiments herein generally relate to data analytics, and, more particularly, to a method and system for predicting purchase behavior of customers by combining temporal and aggregate models.
DESCRIPTION OF THE RELATED ART
[003] Consumer brands often run promotional campaigns and offer discounts or coupons to attract new customers. After such promotional campaigns, it is important to identify the customers who are more likely to make a repeat purchase after the initial incentivized purchase. By focusing on these potential loyal customers in future targeted marketing campaigns, merchants can greatly reduce promotional costs and enhance the return on investment (ROI). This also helps in making pertinent and useful offers to customers. Every retail store has a large number of customers who interact with it. The future purchase behavior of the customers is required to be predicted after giving the customers offers as a part of a promotional campaign, based on the interactions of the customers available with the store.
[004] State-of-the-art systems consider basket-level transaction history to predict the repeat purchase behavior of customers. The basket level information, which actually is aggregate information, involves type of goods purchased by a customer, number of each item purchased, overall purchases made over a period of time and so on. However, the aggregate information may not give a clear picture of purchase pattern of a customer. This is because the aggregate information covers only limited features of a customer behavior, which adversely affects accuracy of any behavior prediction based on the aggregate information.
SUMMARY
[005] Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one embodiment, a method and a data analytics server for customer behavior assessment are provided. The data analytics server comprises a hardware processor and a storage medium comprising a plurality of instructions, the plurality of instructions causing the hardware processor to dynamically fetch a purchase history of at least one customer, by an Input/Output (I/O) interface of the data analytics server, wherein the purchase history comprises at least one of customer features, product features, and customer-product interaction features. An aggregate model for the purchase history is generated by a data processing module of the data analytics server, wherein the aggregate model comprises data of a first type, and a temporal model for the purchase history is generated by the data processing module, wherein the temporal model comprises data of a second type. Further, a combined model is determined based on the aggregate model and the temporal model, using a Mixture of Experts (ME), by a prediction engine of the data analytics server, wherein the prediction engine determines the final prediction score by processing the data of the first type and the data of the second type using the ME. Further, the at least one customer is classified as one of a repeat customer and a non-repeating customer, based on the combined model, by the prediction engine.
[006] In another aspect, a method for customer behavior assessment is provided. In this method, a purchase history of at least one customer is fetched dynamically by a data analytics server, wherein the purchase history comprises at least one of customer features, product features, and customer-product interaction features. Further, an aggregate model for the purchase history is generated by the data analytics server, wherein the aggregate model comprises data of a first type. Further, a temporal model is generated for the purchase history by the data analytics server, wherein the temporal model comprises data of a second type. Further, a combined model is determined based on the aggregate model and the temporal model, using a Mixture of Experts (ME), by the data analytics server, wherein the ME determines the combined model by processing the data of the first type and the data of the second type. The at least one customer is then classified as one of a repeat customer and a non-repeating customer, based on the combined model, by the data analytics server.
[007] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
[008] The embodiments herein will be better understood from the following detailed description with reference to the drawings, in which:
[009] FIG. 1 illustrates a block diagram of a data analytics system, in accordance with an example embodiment;
[0010] FIG. 2 is a block diagram that depicts components of a data analytics server of the data analytics system, in accordance with an example embodiment;
[0011] FIG. 3 is a flow diagram that depicts steps involved in the process of performing data analytics and prediction using the data analytics system, in accordance with an example embodiment;
[0012] FIG. 4 is a flow diagram that depicts steps involved in the process of categorizing a customer as a repeater or non-repeater, using the data analytics system, in accordance with an example embodiment;
[0013] FIG. 5 is a block diagram of a system for generation of proof explanation in predicting purchase behavior of customers, in an embodiment; and
[0014] FIGS. 6a, 6b, and 6c depict experimental data associated with working of the data analytics system, in accordance with an embodiment.
DETAILED DESCRIPTION
[0015] The embodiments herein and the various features and advantageous details thereof are explained more fully with reference to the non-limiting embodiments that are illustrated in the accompanying drawings and detailed in the following description. The examples used herein are intended merely to facilitate an understanding of ways in which the embodiments herein may be practiced and to further enable those of skill in the art to practice the embodiments herein. Accordingly, the examples should not be construed as limiting the scope of the embodiments herein.
[0016] The disclosed embodiments relate to a mechanism of classifying a customer as a repeater or a non-repeater based on his/her previous interaction with one or more stores, and various offers availed by the customer. A repeater is a customer who ends up making a repeat purchase of one or more products considered, wherein the repeat purchase behavior is characterized in terms of parameters such as but not limited to brand, merchant, shop from where the purchase is being made, and company of the product(s) being purchased. In various embodiments, all relevant information such as but not limited to details of the customers, details of offers and so on are extracted from the transaction data to form a purchase history specific to a customer, and then the purchase behavior of the customer is predicted.
[0017] The embodiments herein provide a system and method to enable customer behavior assessment and in turn predict an expected purchase pattern of the customer. The ‘purchase pattern’ indicates characteristics of the purchases made by the customer over a period of time, with respect to certain pre-defined parameters, and in turn helps to categorize customers as repeaters and non-repeating customers. For example, the disclosed system enables customer behavior prediction based on transaction history by utilizing various aggregate functions. Referring now to the drawings, and more particularly to FIGS. 1 through 6, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
[0018] FIG. 1 illustrates a network implementation 100 for customer behavior prediction, in accordance with an embodiment of the present subject matter. The network implementation 100 includes a data analytics server 101 and at least one user device 102. The user device 102 can be a laptop 102.a, a desktop computer 102.b, a Personal Digital Assistant (PDA) 102.c, a smartphone 102.n, and/or any such device that is capable of establishing communication with the data analytics server 101 through at least one suitable channel, at least for the purpose of exchanging data and control signals related to customer behavior prediction. Further, ‘user devices 102’ can refer to the devices being used by the customers, or to one or more devices installed at a service providing center, from which the purchase history of one or more customers can be collected for behavior prediction purposes. For example, in one implementation scenario, the user device 102 is a smartphone being used by a customer, and details of purchases made by that particular customer are extracted from that smartphone by the data analytics server 101. In another implementation scenario, the user device 102 is a data repository located at the service providing center, which holds information related to purchases made by one or more customers at least over a particular time period. Further, when the data analytics server 101 is deployed in a cloud environment and needs to collect purchase history information for at least one customer from at least two user devices 102 over a network, the data is associated with a unique identifier assigned to that particular customer, so that the data analytics server 101 can differentiate between data associated with different customers.
[0019] In various embodiments, the data analytics server 101 is placed in a local network and/or is hosted on cloud network or other similar services, and the data analytics server 101 establishes communication with the user devices 102 over a network. Further, the network can be a wireless network, a wired network or a combination thereof. The network can be implemented as one of the different types of networks, such as intranet, local area network (LAN), wide area network (WAN), the internet, and so on. The network may either be a dedicated network or a shared network. The network represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), and so on, to communicate with one another. Further the network may include a variety of network devices, including routers, bridges, servers, computing devices, storage devices, and the like.
[0020] The data analytics server 101 is configured to collect the purchase history specific to each customer and to derive, by processing the collected purchase history, a combined model that is built by combining temporal and aggregate models generated from the purchase history; the combined model in turn is used for customer behavior prediction. In an embodiment, the temporal and aggregate models may be built based on data pertaining to multiple customers; however, for illustration purposes, the process is explained from a single-customer perspective, and this is not intended to impose any restriction in terms of scope. The purchase pattern is then used by the data analytics server 101 to classify a customer as a repeat customer or a non-repeating customer. In an embodiment, the data analytics server 101 is configured to derive the purchase pattern based on a combination of temporal and aggregate features extracted from the purchase history. The data analytics server 101 is further configured to use a Mixture of Experts (ME) to combine the temporal and aggregate features so as to generate a combined model, which in turn is used to classify the customer as a repeating or non-repeating customer. In an embodiment, a ‘repeating customer’ as identified by the data analytics server 101 can be defined in terms of one or more parameters such as, but not limited to, product, brand, offer, and store. In an embodiment, the ME processes temporal and aggregate features together to identify the purchase pattern, even though the temporal and aggregate features are data of different types. The data analytics server 101 is further configured to use Long Short Term Memory (LSTM) as a classifier over the temporal features, and Quantile Regression (QR) as a classifier over the aggregate features. The temporal features can include any time-instance-related information with respect to various activities in the purchase history of the customer.
For example, a time series spanning the total duration is formed for each customer for each of several features, such as the number of different items the customer purchases daily or weekly (or within any other time frame), and these time series are used by the temporal model for learning. The time series for each feature consists of values of the feature over a period of time, for example, the quantity of a product bought in week 1, followed by the quantity bought in week 2, and so on. Time series for several such features are considered together, so that a multivariate time series is formed. The temporal model captures the relations between different features (i.e., the dimensions of the multivariate time series) as well as between values of a feature across time, which in turn improves the accuracy of the prediction. The repeat fraction of a product, which is the ratio of the number of customers who have purchased the product more than once in the past to the number of customers who have purchased the product at least once in the past, is also considered. Further, aggregate features can refer to any parameter that is associated with the place, goods, or location of any purchase specified in the collected purchase history. For example, the types of items purchased, the quantity of each item purchased, the number of each item purchased, the store from which the items were purchased, the price of the items purchased, the location of the stores, and so on, over a period of time, are considered as aggregate parameters. Further, the data analytics server 101 is configured to collect the aggregate and temporal features from customer-based features, product-based features, and customer-product interaction based features.
Customer-based features capture a customer’s overall purchasing behavior in terms of total visits made, number of distinct products or brands purchased, loyalty of the customer (i.e., the ratio of the number of times the customer purchased a product of a particular category, company, and brand to the number of times the customer purchased any similar product belonging to the same category), total spend, and the like. Product-based features are based on the concept that some offers have more repeaters than others, for reasons such as the marketing strategy, the discount given, and the quality and popularity of the product on which the offer is made. Further, product-based features are related to aspects of the product(s) on which offers are made. Features such as the fraction of customers who become repeaters for the offer-product, and similarly for the offer-product’s brand, company, and the like, after a promotional campaign are considered. Customer-product interaction based features capture the affinity of a customer to the offer-product. Features such as the quantity bought and the amount spent by a customer on the offer-product, and similarly on the offer-product’s brand, company, and the like, are considered.
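By way of illustration only, the repeat fraction of a product and a weekly quantity time series for one customer could be derived from raw transactions roughly as sketched below. The record layout, the sample data, and the function names repeat_fraction and weekly_quantity_series are hypothetical and are not taken from the disclosure; each record is assumed to represent one purchase occasion.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (customer_id, product_id, purchase_date, quantity).
transactions = [
    ("c1", "p1", date(2015, 1, 5), 2),
    ("c1", "p1", date(2015, 1, 14), 1),
    ("c2", "p1", date(2015, 1, 6), 3),
    ("c2", "p2", date(2015, 1, 7), 1),
    ("c3", "p2", date(2015, 1, 20), 4),
]

def repeat_fraction(records, product_id):
    """Customers who bought the product more than once, divided by
    customers who bought it at least once."""
    purchases_per_customer = defaultdict(int)
    for customer, product, _, _ in records:
        if product == product_id:
            purchases_per_customer[customer] += 1
    if not purchases_per_customer:
        return 0.0  # assumption: a product nobody bought has repeat fraction 0
    repeaters = sum(1 for n in purchases_per_customer.values() if n > 1)
    return repeaters / len(purchases_per_customer)

def weekly_quantity_series(records, customer_id, start, num_weeks):
    """Total quantity bought by one customer in each consecutive
    7-day window counted from `start`."""
    series = [0] * num_weeks
    for customer, _, day, quantity in records:
        week = (day - start).days // 7
        if customer == customer_id and 0 <= week < num_weeks:
            series[week] += quantity
    return series

print(repeat_fraction(transactions, "p1"))
print(weekly_quantity_series(transactions, "c1", date(2015, 1, 5), 3))
```

Stacking several such per-feature series for the same customer, week by week, would give the multivariate time series referred to above.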
[0021] FIG. 2 is a block diagram that depicts components of a data analytics server of the data analytics system, in accordance with an example embodiment. The data analytics server 101 includes an Input/Output (I/O) interface 201, a memory module 202, a data processing module 203, and a prediction engine 204.
[0022] The I/O interface 201 is configured to provide at least one communication channel for the data analytics server 101 to establish communication with at least one user device 102 and exchange at least one type of data associated at least with the purchase behavior prediction. The I/O interface 201 can be configured to support suitable communication protocols, and different modes of communication (for example, wired communication, wireless communication and so on) as required.
[0023] The memory module 202 is configured to store, temporarily or permanently, any type of information associated with the purchase pattern identification and the associated classification of the customer as a repeating or non-repeating customer, for data processing as well as reference purposes, as required. For example, the memory module 202 stores information such as, but not limited to, the purchase history of the customer, the identified purchase pattern, and the classification of the customer. In an embodiment, the data pertaining to each customer is mapped against the unique identification data that represents the customer. The unique identification data can be a number, letters, special characters, or a combination thereof, and is used to uniquely identify each customer and the corresponding information.
[0024] The data processing module 203 can be configured to process the collected purchase history of a customer and to generate a combined model corresponding to the collected data. In this process, the data processing module 203 extracts temporal and aggregate features from the collected purchase history. In an embodiment of the present disclosure, the aggregate features (which are data of a first type) and the temporal features (which are data of a second type) are extracted based on features present in the purchase history, such as, but not limited to, at least one of total visits made by customers, total amount spent by customers, products purchased, brand of products purchased, loyalty, repeat fraction for each product, repeat fraction for brands, frequency of purchase, and quantity of each product bought. The data processing module 203 generates the aggregate model and a corresponding aggregate coefficient by using QR as a classifier over the aggregate features. In an embodiment, the aggregate model is data of a first type. The data processing module 203, by using LSTM as a classifier over the temporal features, generates a temporal model and a corresponding temporal coefficient. In an embodiment, the temporal model is data of a second type. In order to facilitate processing of the temporal model by the ME, the data processing module 203 processes the temporal model to extract at least one prediction from it, which in turn is provided as input to the ME for processing along with an aggregate model and a plurality of aggregate features. In an embodiment, the at least one prediction from the temporal model can refer to a prediction made with respect to a purchase pattern of the customer, based on the temporal model.
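Quantile Regression is conventionally fitted by minimizing the quantile (pinball) loss. The following is a minimal sketch of that objective only, not of the claimed aggregate model; it assumes y is the observed label (for instance 1 for a repeater and 0 otherwise), p is the predicted score, and q is the chosen quantile. The function names are hypothetical.

```python
def pinball_loss(y, p, q):
    """Quantile loss for one example:
    q * (y - p) when y >= p, otherwise (1 - q) * (p - y)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    if y >= p:
        return q * (y - p)
    return (1.0 - q) * (p - y)

def mean_pinball_loss(labels, predictions, q):
    """Average quantile loss over paired labels and predictions."""
    if len(labels) != len(predictions) or not labels:
        raise ValueError("labels and predictions must be non-empty and equal in length")
    total = sum(pinball_loss(y, p, q) for y, p in zip(labels, predictions))
    return total / len(labels)

# Under-prediction is penalized by q, over-prediction by (1 - q).
print(pinball_loss(1.0, 0.25, 0.5))
print(pinball_loss(0.0, 0.25, 0.75))
```

Choosing q above 0.5 penalizes under-prediction more heavily than over-prediction, and choosing q below 0.5 does the opposite.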
[0025] The data processing module 203 further processes the temporal and aggregate models using the ME, and generates a combined model and a corresponding combined coefficient. In an embodiment, the at least one prediction from the temporal model is processed along with at least one prediction from the aggregate model, and a plurality of aggregate features, by the ME, though they are different types of data. The data processing module 203 is further configured to provide the combined coefficient as input to the prediction engine 204.
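One common way to realize a Mixture of Experts is to let a gating function, driven by input features, weight the predictions of the individual experts. The sketch below assumes a softmax gate that is linear in the aggregate features and blends one prediction from the aggregate model with one from the temporal model. The gate form, the gate weights, and the function names are illustrative assumptions; the disclosure does not specify them.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def combine_with_gate(p_aggregate, p_temporal, aggregate_features, gate_weights):
    """Blend two expert predictions using a gate computed from aggregate features.

    gate_weights holds one weight vector per expert (aggregate first,
    temporal second); in practice these would be learned, here they are
    supplied by the caller.
    """
    scores = [
        sum(w * x for w, x in zip(weights, aggregate_features))
        for weights in gate_weights
    ]
    g_aggregate, g_temporal = softmax(scores)
    return g_aggregate * p_aggregate + g_temporal * p_temporal

# Hypothetical inputs: two aggregate features and hand-picked gate weights.
features = [1.0, 2.0]
weights = [[0.0, 0.0], [0.5, 0.25]]
print(combine_with_gate(0.2, 0.8, features, weights))
```

Because the gate weights sum to one, the combined value always lies between the two expert predictions.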
[0026] The prediction engine 204 is configured to compare the combined coefficient with a threshold value of the coefficient, and to identify whether or not the customer is a repeat customer. In an embodiment, the threshold value of the coefficient is pre-configured at the time of initial configuration of the data analytics system 100. In another embodiment, the threshold value of the coefficient is dynamically configured using at least one suitable provision supported by the data analytics system 100. In an implementation scenario, if the value of the combined coefficient is found to exceed the threshold value (i.e., a reference threshold), then the customer is treated as a repeat customer, and if the value of the combined coefficient is found to be less than the reference threshold, then the prediction engine 204 treats the customer as a non-repeating customer. However, these conditions and the values of the reference parameters can be changed or reversed as needed, dynamically or statically, by an authorized person.
[0027] FIG. 3 is a flow diagram that depicts steps involved in the process of performing data analytics and prediction using the data analytics system, in accordance with an example embodiment. It is to be noted that data from multiple customers may be required to build the temporal and aggregate models. However, FIG. 3 and the description provided herein explain the data analytics from a single-customer perspective for illustration purposes, and this is not intended to impose any restriction in terms of the number of customers considered or the associated data collected for analytics. In order to determine the purchase pattern of a customer, the data analytics server 101 collects (302) the purchase history of the customer as input. The data analytics server 101 further extracts (304) one or more features from the purchase history, using suitable data processing techniques, wherein the features include at least one aggregate feature and at least one temporal feature.
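The threshold comparison described above can be sketched as follows. The default threshold of 0.5, the treatment of a coefficient exactly equal to the threshold as non-repeating, and the function name classify_customer are assumptions made for illustration; the disclosure leaves these choices to configuration.

```python
def classify_customer(combined_coefficient, threshold=0.5):
    """Label a customer 'repeat' when the combined coefficient exceeds
    the threshold, and 'non-repeating' otherwise (ties included)."""
    if combined_coefficient > threshold:
        return "repeat"
    return "non-repeating"

# Hypothetical combined coefficients keyed by customer identifier.
scores = {"c1": 0.82, "c2": 0.31}
labels = {customer: classify_customer(score) for customer, score in scores.items()}
print(labels)
```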
[0028] Further, the data analytics server 101 builds (306) an aggregate model based on the extracted aggregate feature(s). In an embodiment, the data analytics server 101 uses Quantile Regression (QR) as a classifier over the aggregate feature(s) so as to generate the aggregate model and a corresponding aggregate coefficient. The loss function for QR, when used as the classifier over the aggregate features, is q(y - p) I(y ≥ p) + (1 - q)(p - y) I(y < p).