Abstract: METHOD AND SYSTEM FOR POST-HOC EXPLANATIONS FOR SESSION-BASED RECOMMENDATIONS Session-based recommendation (SR) approaches have extensively employed deep neural networks (DNN) to provide high-quality recommendations based on a user’s current interactions and item features. However, these approaches are black-box models, providing recommendations that are not understandable to the users and system designers. To further trust and transparency in the recommendation system and extrapolate insights into customer behavior, it is essential to provide explanations for why an item is recommended to a certain user. Most of the conventional approaches do not provide explanations at an aggregate level. Hence, there is a challenge in generating explanations at an aggregate level without compromising accuracy. To overcome the challenges of the conventional approaches, embodiments herein provide a method and system to generate quality explanations at two levels: (i) Local explanations: explanation of recommended items for the current session, and (ii) Global explanations: explanations for the recommended item at an aggregate level. [To be published with FIG. 2]
Description:TECHNICAL FIELD
The disclosure herein generally relates to the field of e-commerce and, more particularly, to a method and system for post-hoc explanations for session-based recommendations.
BACKGROUND
Recommendation systems (RS) are an integral component of e-commerce, online advertising, and streaming applications, allowing systems to provide relevant content, boost sales, and improve user experience. Session-based recommendation (SR) systems, where the system has to dynamically make recommendations based on current session interactions without any prior user history, are another type of RS currently used by many e-commerce sources. Generating explanations along with recommendations is essential to build trust and improve user satisfaction, while assisting system designers to rectify irrelevant recommendations.
Session-based recommendation (SR) approaches have extensively employed deep neural networks (DNN) to provide high-quality recommendations based on a user’s current interactions and item features. In order to make DNNs interpretable (post-hoc), one prominent technique is to learn a less complex proxy model to locally mimic and understand a DNN's behavior. However, this requires additional training, and the explanations are not guaranteed to mimic the exact pattern of reasoning in the SR model. Another post-hoc approach generates personalized post-hoc explanations based on item-level causal rules to explain the behaviors of a sequential recommendation model. However, it compromises on recommendation accuracy by constraining the model to rely on the causal rules. Yet another approach provides explanations on the session and item levels by generating a set of scores, i.e., causality and correlation scores. However, it cannot be applied to any other SR approach due to the unavailability of causality scores. A further approach generates explanations within a session by considering three factors: sequential patterns, repetition clicks, and item similarities. However, it does not provide explanations at an aggregate level. Hence, there is a challenge in generating explanations at an aggregate level without compromising accuracy.
SUMMARY
Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one embodiment, a method for post-hoc explanations for session-based recommendations is provided. The method includes receiving, by one or more hardware processors, recommendation information generated by a recommendation engine in response to an item clicked by a user in an e-commerce environment, wherein the recommendation information comprises a plurality of recommended items, a plurality of learnt item embeddings, and a plurality of learnt session embeddings. Further, the method includes identifying, by the one or more hardware processors, a plurality of prior sessions and a plurality of prior session embeddings from among the plurality of learnt session embeddings corresponding to the plurality of recommended items based on an availability of the plurality of recommended items in the plurality of prior sessions. Furthermore, the method includes generating, by the one or more hardware processors, a plurality of local explaining items based on the plurality of prior session embeddings using a cosine similarity based local explanation generation technique. Furthermore, the method includes simultaneously generating, by the one or more hardware processors, a plurality of global explaining items based on the plurality of prior session embeddings using a clustering based global explanation generation technique. Finally, the method includes generating, by the one or more hardware processors, a post-hoc explanation comprising a local explanation and a global explanation based on the plurality of local explaining items and the plurality of global explaining items using an explanation generation tool.
In another aspect, a system for post-hoc explanations for session-based recommendations is provided. The system includes at least one memory storing programmed instructions, one or more Input/Output (I/O) interfaces, and one or more hardware processors operatively coupled to the at least one memory, wherein the one or more hardware processors are configured by the programmed instructions to receive
recommendation information generated by a recommendation engine in response to an item clicked by a user in an e-commerce environment, wherein the recommendation information comprises a plurality of recommended items, a plurality of learnt item embeddings, and a plurality of learnt session embeddings. Further, the one or more hardware processors are configured by the programmed instructions to identify a plurality of prior sessions and a plurality of prior session embeddings from among the plurality of learnt session embeddings corresponding to the plurality of recommended items based on an availability of the plurality of recommended items in the plurality of prior sessions. Furthermore, the one or more hardware processors are configured by the programmed instructions to generate a plurality of local explaining items based on the plurality of prior session embeddings using a cosine similarity based local explanation generation technique. Furthermore, the one or more hardware processors are configured by the programmed instructions to simultaneously generate a plurality of global explaining items based on the plurality of prior session embeddings using a clustering based global explanation generation technique. Finally, the one or more hardware processors are configured by the programmed instructions to generate a post-hoc explanation comprising a local explanation and a global explanation based on the plurality of local explaining items and the plurality of global explaining items using an explanation generation tool.
In yet another aspect, a computer program product including a non-transitory computer-readable medium having embodied therein a computer program for post-hoc explanations for session-based recommendations is provided. The computer readable program, when executed on a computing device, causes the computing device to receive recommendation information generated by a recommendation engine in response to an item clicked by a user in an e-commerce environment, wherein the recommendation information comprises a plurality of recommended items, a plurality of learnt item embeddings, and a plurality of learnt session embeddings. Further, the computer readable program, when executed on a computing device, causes the computing device to identify a plurality of prior sessions and a plurality of prior session embeddings from among the plurality of learnt session embeddings corresponding to the plurality of recommended items based on an availability of the plurality of recommended items in the plurality of prior sessions. Furthermore, the computer readable program, when executed on a computing device, causes the computing device to generate a plurality of local explaining items based on the plurality of prior session embeddings using a cosine similarity based local explanation generation technique. Furthermore, the computer readable program, when executed on a computing device, causes the computing device to simultaneously generate a plurality of global explaining items based on the plurality of prior session embeddings using a clustering based global explanation generation technique. Finally, the computer readable program, when executed on a computing device, causes the computing device to generate a post-hoc explanation comprising a local explanation and a global explanation based on the plurality of local explaining items and the plurality of global explaining items using an explanation generation tool.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
FIG. 1 is a functional block diagram of a system for post-hoc explanations for session-based recommendations, in accordance with some embodiments of the present disclosure.
FIG. 2 illustrates a functional architecture of the system of FIG. 1, for post-hoc explanations for session-based recommendations, in accordance with some embodiments of the present disclosure.
FIG. 3 is an exemplary flow diagram illustrating a processor implemented method for post-hoc explanations for session-based recommendations implemented by the system of FIG. 1 according to some embodiments of the present disclosure.
FIG. 4 is an exemplary flow diagram illustrating a method for local explaining items generation implemented by the system of FIG. 1 according to some embodiments of the present disclosure.
FIG. 5 is an exemplary flow diagram illustrating a method for global explaining items generation implemented by the system of FIG. 1 according to some embodiments of the present disclosure.
FIG. 6A through 6E are experimental results illustrating the method for post-hoc explanations for session-based recommendations implemented by the system of FIG. 1 according to some embodiments of the present disclosure.
DETAILED DESCRIPTION OF EMBODIMENTS
Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the spirit and scope of the disclosed embodiments.
Session-based recommendation (SR) approaches have extensively employed deep neural networks (DNN) to provide high-quality recommendations based on a user’s current interactions and item features. However, these approaches are black-box models, providing recommendations that are not understandable to the users and system designers. To further trust and transparency in the recommendation system and extrapolate insights into customer behavior, it is essential to provide explanations for why an item is recommended to a certain user.
Most of the conventional approaches require additional training, and the explanations are not guaranteed to mimic the exact pattern of reasoning in the SR model. One post-hoc approach generates personalized post-hoc explanations based on item-level causal rules to explain the behaviors of a sequential recommendation model. However, it compromises on recommendation accuracy by constraining the model to rely on the causal rules. Another approach provides explanations on the session and item levels by generating a set of scores, i.e., causality and correlation scores. However, it cannot be applied to any other SR approach due to the unavailability of causality scores. Another approach generates explanations within a session by considering three factors: sequential patterns, repetition clicks, and item similarities. However, it does not provide explanations at an aggregate level. Hence, there is a challenge in generating explanations at an aggregate level without compromising accuracy.
To overcome the challenges of the conventional approaches, embodiments herein provide a method and system to generate quality explanations at two levels: (i) local explanations: explanations of the recommended items for the current session, and (ii) global explanations: explanations for the recommended item at an aggregate level. Local explanations are important for end-users/customers to trust the system, and global explanations are useful for a business user to understand aggregated customer behavior. The generated explaining items, along with meta-information, i.e., the current session (in the case of local explanations) and similar prior sessions (in the case of local and global explanations), can be parsed and reasoned over via LLMs to get verbalized explanations that are understandable to the user as well as the system designer.
Referring now to the drawings, and more particularly to FIGS. 1 through 6E, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
FIG. 1 is a functional block diagram of a system 100 for post-hoc explanations for session-based recommendations, in accordance with some embodiments of the present disclosure. The system 100 includes or is otherwise in communication with hardware processors 102, at least one memory such as a memory 104, an I/O interface 112. The hardware processors 102, memory 104, and the Input /Output (I/O) interface 112 may be coupled by a system bus such as a system bus 108 or a similar mechanism. In an embodiment, the hardware processors 102 can be one or more hardware processors.
The I/O interface 112 may include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like. The I/O interface 112 may include a variety of software and hardware interfaces, for example, interfaces for peripheral device(s), such as a keyboard, a mouse, an external memory, a printer and the like. Further, the I/O interface 112 may enable the system 100 to communicate with other devices, such as web servers, and external databases.
The I/O interface 112 can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, local area network (LAN), cable, etc., and wireless networks, such as Wireless LAN (WLAN), cellular, or satellite. For the purpose, the I/O interface 112 may include one or more ports for connecting several computing systems with one another or to another server computer. The I/O interface 112 may include one or more ports for connecting several devices to one another or to another server.
The one or more hardware processors 102 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, node machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the one or more hardware processors 102 is configured to fetch and execute computer-readable instructions stored in the memory 104.
The memory 104 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. In an embodiment, the memory 104 includes a plurality of modules 106. The memory 104 also includes a data repository (or repository) 110 for storing data processed, received, and generated by the plurality of modules 106.
The plurality of modules 106 include programs or coded instructions that supplement applications or functions performed by the system 100 for post-hoc explanations for session-based recommendations. The plurality of modules 106, amongst other things, can include routines, programs, objects, components, and data structures, which performs particular tasks or implement particular abstract data types. The plurality of modules 106 may also be used as, signal processor(s), node machine(s), logic circuitries, and/or any other device or component that manipulates signals based on operational instructions. Further, the plurality of modules 106 can be used by hardware, by computer-readable instructions executed by the one or more hardware processors 102, or by a combination thereof. The plurality of modules 106 can include various sub-modules (not shown). The plurality of modules 106 may include computer-readable instructions that supplement applications or functions performed by the system 100 for post-hoc explanations for session-based recommendations. In an embodiment, the modules 106 include a Prior sessions and prior session embeddings identification module (shown in FIG. 2), a local explaining items generation module (shown in FIG. 2), a global explaining items generation module (shown in FIG. 2), and a post-hoc explanation generation module (shown in FIG. 2). In an embodiment, FIG. 2 illustrates a functional architecture of the system of FIG. 1, for post-hoc explanations for session-based recommendations, in accordance with some embodiments of the present disclosure.
The data repository (or repository) 110 may include a plurality of abstracted piece of code for refinement and data that is processed, received, or generated as a result of the execution of the plurality of modules in the module(s) 106.
Although the data repository 110 is shown internal to the system 100, it will be noted that, in alternate embodiments, the data repository 110 can also be implemented external to the system 100, where the data repository 110 may be stored within a database (repository 110) communicatively coupled to the system 100. The data contained within such an external database may be periodically updated. For example, new data may be added into the database (not shown in FIG. 1) and/or existing data may be modified and/or non-useful data may be deleted from the database. In one example, the data may be stored in an external system, such as a Lightweight Directory Access Protocol (LDAP) directory and a Relational Database Management System (RDBMS). The working of the components of the system 100 is explained with reference to the method steps depicted in FIG. 3 through FIG. 5.
FIG. 3 is an exemplary flow diagram illustrating a method 300 for post-hoc explanations for session-based recommendations implemented by the system of FIG. 1 according to some embodiments of the present disclosure. In an embodiment, the system 100 includes one or more data storage devices or the memory 104 operatively coupled to the one or more hardware processor(s) 102 and is configured to store instructions for execution of steps of the method 300 by the one or more hardware processors 102. The steps of the method 300 of the present disclosure will now be explained with reference to the components or blocks of the system 100 as depicted in FIG. 1 and the steps of flow diagram as depicted in FIG. 3. The method 300 may be described in the general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, etc., that perform particular functions or implement particular abstract data types. The method 300 may also be practiced in a distributed computing environment where functions are performed by remote processing devices that are linked through a communication network. The order in which the method 300 is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method 300, or an alternative method. Furthermore, the method 300 can be implemented in any suitable hardware, software, firmware, or combination thereof.
At step 302 of the method 300, the one or more hardware processors 102 are configured by the programmed instructions to receive recommendation information generated by a recommendation engine in response to an item clicked by a user in a current session s in an e-commerce environment. The recommendation information comprises a plurality of recommended items I_r, a plurality of learnt item embeddings, and a plurality of learnt session embeddings. The plurality of learnt session embeddings includes a plurality of prior session embeddings S_tr and a plurality of current session embeddings S_te.
Let S_tr and S_te be the sets of prior (training) sessions and current (testing) sessions, respectively. Let I be the set of m items observed in the set S_tr. Given any current session s ∈ S_te, which is a sequence of l item-click events, I_s = {i_(s,1), i_(s,2), . . . , i_(s,l)}, where i_(s,j) ∈ I, the SR model M predicts a recommendation list of top-k items, I_r = {i_1, i_2, . . . , i_k} ⊆ I. From a trained SR model M, a learned item embedding i_j ∈ R^d is obtained for each item of I, and the set of item embeddings is denoted by I. Similarly, learned session embeddings s ∈ R^d are obtained for all prior and current sessions, denoted by S_tr and S_te, respectively.
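For illustration, the notation above maps onto simple Python structures. The dimensions, item and session identifiers, and random vectors below are hypothetical placeholders rather than learned values; they merely show the shape of the inputs the method consumes.

```python
import numpy as np

d = 8  # embedding dimension (illustrative)
m = 6  # number of items observed in the prior sessions

rng = np.random.default_rng(0)

# Learned item embeddings: one d-dimensional vector per item in I.
item_emb = {f"i{j}": rng.normal(size=d) for j in range(m)}

# Prior (training) sessions S_tr and current (testing) sessions S_te,
# each a sequence of item-click events.
S_tr = {"s0": ["i0", "i1", "i2"], "s1": ["i1", "i3"]}
S_te = {"t0": ["i2", "i4", "i5"]}

# Learned session embeddings: one d-dimensional vector per session.
sess_emb_tr = {s: rng.normal(size=d) for s in S_tr}
sess_emb_te = {s: rng.normal(size=d) for s in S_te}
```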
At step 304 of the method 300, the prior sessions and prior session embeddings identification module 202 executed by one or more hardware processors 102 is configured by the programmed instructions to identify a plurality of prior sessions and a plurality of prior session embeddings from among the plurality of learnt session embeddings corresponding to the plurality of recommended items based on an availability of the plurality of recommended items in the plurality of prior sessions.
At step 306 of the method 300, the local explaining items generation module 204 executed by the one or more hardware processors 102 is configured by the programmed instructions to generate a plurality of local explaining items based on the plurality of prior session embeddings using a cosine similarity based local explanation generation technique. The steps for generating the plurality of local explaining items are explained in conjunction with FIG. 4.
Now referring to FIG. 4, at step 402 of the method 400, the one or more hardware processors 102 is configured by the programmed instructions to select a plurality of local candidate prior sessions from among the plurality of prior session embeddings by computing a cosine similarity value between a plurality of current session embeddings and the plurality of prior session embeddings, wherein the plurality of prior session embeddings with a cosine similarity value greater than a predefined threshold are selected as the plurality of local candidate prior sessions.
At step 404 of the method 400, the one or more hardware processors 102 is configured by the programmed instructions to obtain a plurality of local candidate items based on the plurality of candidate prior sessions, wherein each of the plurality of items associated with the plurality of candidate prior sessions are obtained.
At step 406 of the method 400, the one or more hardware processors 102 is configured by the programmed instructions to select a plurality of optimum pair-wise similar items by computing a pairwise similarity score between each of the plurality of local candidate items, wherein each pair with a corresponding pair-wise similarity score greater than a predefined threshold are selected.
At step 408 of the method 400, the one or more hardware processors 102 is configured by the programmed instructions to obtain a plurality of relevant items by combining the plurality of optimum pair-wise similar items and a plurality of frequently occurring items, wherein the plurality of frequently occurring items are selected from the plurality of local candidate items based on frequency of occurrence.
At step 410 of the method 400, the one or more hardware processors 102 is configured by the programmed instructions to generate the plurality of local explaining items by computing a cosine similarity between the plurality of relevant items and the items clicked by the user in the current session, wherein the plurality of items clicked in the current session with maximum similarity with the plurality of relevant items are obtained.
In an embodiment, a pseudocode 1 for generating the plurality of local explaining items is given below:
Pseudocode 1:
Given recommended items I_r, item-click history in session s as I_s, learned item embeddings I, and prior and current session embeddings S_tr and S_te.
for each session s in S_te do
  for each recommended item i_j ∈ I_r = {i_1, i_2, . . . , i_k} do
    S_tr^j = {s' | i_j ∈ I_s'}, ∀ s' ∈ S_tr
    Candidate prior sessions S_tr^(s,j) = arg max_n (cosine(S_tr^j, s))
    Candidate item set I_tr^(s,j) = ∪ {i' ∈ I_s'}, ∀ s' ∈ S_tr^(s,j)
    I_1 = most frequent items across S_tr^(s,j), I_1 ⊆ I_tr^(s,j)
    I_2 = {i' | max(cosine(i', I_tr^(s,j))) ≥ β}, ∀ i' ∈ I_tr^(s,j)
    Relevant items X_tr^(s,j) = I_1 ∪ I_2 − {i_j}
    Sim^(s,j) = cosine(I_s, (X_tr^(s,j) ⊕ i_j)), where ⊕ denotes concatenation
    x_te^(s,j) = arg max_(i' ∈ I_s) Sim^(s,j)
  end for
  Explaining items for session s: X_te^s = most frequent items from x_te^(s,j), ∀ i_j ∈ I_r. Here, the top-2 frequencies are considered for selecting the explaining items in session s.
end for
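Pseudocode 1 can be sketched as runnable Python for a single current session. This is a minimal illustrative sketch under assumed names and toy defaults (top-n candidate sessions, pairwise threshold beta, top-2 vote frequencies), not the disclosed implementation.

```python
import numpy as np
from collections import Counter

def cosine(A, B):
    """Row-wise cosine similarity matrix between the rows of A and B."""
    A, B = np.atleast_2d(A), np.atleast_2d(B)
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T

def local_explaining_items(I_s, I_r, item_emb, prior_sessions, sess_emb_tr,
                           s_emb, n=2, beta=0.5, top_freq=2):
    """Sketch of Pseudocode 1 for one current session.

    I_s: clicked items in the current session; I_r: recommended items;
    item_emb / sess_emb_tr: dicts of learned embeddings; prior_sessions:
    dict mapping a prior session id to its item-click sequence.
    """
    votes = Counter()
    for i_j in I_r:
        # S_tr^j: prior sessions that contain the recommended item i_j.
        S_j = [s for s, items in prior_sessions.items() if i_j in items]
        if not S_j:
            continue
        # Candidate prior sessions: top-n most similar to the current session.
        sims = cosine(s_emb, np.stack([sess_emb_tr[s] for s in S_j])).ravel()
        cand_sessions = [S_j[k] for k in np.argsort(sims)[::-1][:n]]
        # Candidate item set: union of items over the candidate prior sessions.
        cand = sorted({i for s in cand_sessions for i in prior_sessions[s]})
        # I_1: most frequent candidate items.
        freq = Counter(i for s in cand_sessions for i in prior_sessions[s])
        I_1 = {i for i, _ in freq.most_common(top_freq)}
        # I_2: candidate items whose best pairwise similarity reaches beta.
        E = np.stack([item_emb[i] for i in cand])
        P = cosine(E, E)
        np.fill_diagonal(P, -1.0)
        I_2 = {cand[k] for k in range(len(cand)) if P[k].max() >= beta}
        # Relevant items X, excluding the recommended item itself.
        X = (I_1 | I_2) - {i_j}
        if not X:
            continue
        # The clicked item most similar to the relevant items (plus i_j) gets a vote.
        R = np.stack([item_emb[i] for i in X | {i_j}])
        clicked = [i for i in I_s if i in item_emb]
        best = cosine(np.stack([item_emb[i] for i in clicked]), R).max(axis=1)
        votes[clicked[int(np.argmax(best))]] += 1
    # Explaining items: the top-2 most frequently voted clicked items.
    return [i for i, _ in votes.most_common(2)]
```

With toy two-dimensional embeddings, the sketch surfaces the clicked item whose embedding is closest to the items that co-occur with the recommendation in similar prior sessions.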
Now, referring back to FIG. 3, at step 308 of the method 300, the global explaining items generation module 206 executed by the one or more hardware processors 102 is configured by the programmed instructions to simultaneously generate a plurality of global explaining items based on the plurality of prior session embeddings using a clustering based global explanation generation technique. The steps for generating the plurality of global explaining items are explained in conjunction with FIG. 5.
Now referring to FIG. 5, at step 502 of the method 500, the one or more hardware processors 102 is configured by the programmed instructions to generate a plurality of session clusters by clustering the plurality of prior sessions using a density based clustering technique. For example, the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) technique is used for clustering.
At step 504 of the method 500, the one or more hardware processors 102 is configured by the programmed instructions to compute a centroid associated with each of the plurality of session clusters using an averaging technique. For example, the averaging technique used here is a mean computation.
At step 506 of the method 500, the one or more hardware processors 102 is configured by the programmed instructions to obtain a plurality of global candidate prior sessions from among the plurality of prior session embeddings by computing a cosine similarity between the plurality of prior session embeddings and the centroid. The plurality of prior session embeddings with a cosine similarity greater than a predefined cosine similarity threshold are selected as the plurality of global candidate prior sessions. For example, the predefined cosine similarity threshold used here is 0.65.
At step 508 of the method 500, the one or more hardware processors 102 is configured by the programmed instructions to obtain a plurality of global candidate items from the plurality of global candidate prior sessions, wherein each of the plurality of items associated with the plurality of candidate prior sessions are obtained.
At step 510 of the method 500, the one or more hardware processors 102 is configured by the programmed instructions to select a plurality of global pair-wise similar items by computing a pairwise similarity score between each of the plurality of global candidate items. Each pair with a corresponding pair-wise similarity score greater than a predefined pair-wise similarity threshold are selected.
At step 512 of the method 500, the one or more hardware processors 102 is configured by the programmed instructions to obtain a plurality of global explaining items by combining the plurality of global pair-wise similar items and the plurality of frequently occurring items. The plurality of frequently occurring items are selected from the plurality of global candidate items.
In an embodiment, a pseudocode 2 for generating the plurality of global explaining items is given below:
Pseudocode 2:
Given recommended items I_r, item-click history in session s as I_s, learned item embeddings I, and prior and current session embeddings S_tr and S_te.
for each item i_j, j = 1, 2, . . . , m ∈ I do
  S_tr^j = {s' | i_j ∈ I_s'}, ∀ s' ∈ S_tr
  Clusters C = DBSCAN(S_tr^j, ε, min_samples)
  for each cluster c in C do
    Centroid c̄ = mean(S_tr^(c,j)), where S_tr^(c,j) ∈ c
    Candidate prior sessions S_tr^(c,j) = arg max_n (cosine(S_tr^j, c̄))
    Candidate item set I_tr^(c,j) = ∪ {i' ∈ I_s'}, ∀ s' ∈ S_tr^(c,j)
    I_1 = most frequent items across S_tr^(c,j), I_1 ⊆ I_tr^(c,j)
    I_2 = {i' | max(cosine(i', I_tr^(c,j))) ≥ β}, ∀ i' ∈ I_tr^(c,j)
    Explaining items X_tr^(c,j) = I_1 ∪ I_2 − {i_j}
  end for
end for
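Pseudocode 2 can likewise be sketched in Python. The simplified DBSCAN below (cosine distance, no separate border-point bookkeeping) and all names and parameter values are illustrative assumptions; in practice, any standard DBSCAN implementation over the session embeddings may be substituted.

```python
import numpy as np
from collections import Counter

def cosine_dist(A):
    """Pairwise cosine-distance matrix for the rows of A."""
    N = A / np.linalg.norm(A, axis=1, keepdims=True)
    return 1.0 - N @ N.T

def dbscan(A, eps, min_samples):
    """Simplified DBSCAN over cosine distance; returns one label per row (-1 = noise)."""
    D = cosine_dist(A)
    labels = np.full(len(A), -1)
    cluster = 0
    for p in range(len(A)):
        if labels[p] != -1:
            continue
        neigh = np.flatnonzero(D[p] <= eps)
        if len(neigh) < min_samples:
            continue  # not a core point; stays noise unless reached later
        labels[p] = cluster
        queue = list(neigh)
        while queue:
            q = queue.pop()
            if labels[q] != -1:
                continue
            labels[q] = cluster
            q_neigh = np.flatnonzero(D[q] <= eps)
            if len(q_neigh) >= min_samples:
                queue.extend(q_neigh)
        cluster += 1
    return labels

def global_explaining_items(i_j, prior_sessions, sess_emb_tr, item_emb,
                            eps=0.1, min_samples=2, n=2, beta=0.5, top_freq=2):
    """Sketch of Pseudocode 2 for one item i_j: explaining items per session cluster."""
    # S_tr^j: prior sessions containing i_j, and their embeddings.
    S_j = [s for s, items in prior_sessions.items() if i_j in items]
    if len(S_j) < min_samples:
        return {}
    A = np.stack([sess_emb_tr[s] for s in S_j])
    labels = dbscan(A, eps, min_samples)
    out = {}
    for c in sorted(set(labels.tolist()) - {-1}):
        # Centroid of the cluster, then the top-n sessions closest to it.
        centroid = A[labels == c].mean(axis=0)
        sims = (A @ centroid) / (np.linalg.norm(A, axis=1) * np.linalg.norm(centroid))
        cand_sessions = [S_j[k] for k in np.argsort(sims)[::-1][:n]]
        cand = sorted({i for s in cand_sessions for i in prior_sessions[s]})
        # I_1: most frequent candidate items; I_2: pairwise-similar items above beta.
        freq = Counter(i for s in cand_sessions for i in prior_sessions[s])
        I_1 = {i for i, _ in freq.most_common(top_freq)}
        E = np.stack([item_emb[i] for i in cand])
        N = E / np.linalg.norm(E, axis=1, keepdims=True)
        P = N @ N.T
        np.fill_diagonal(P, -1.0)
        I_2 = {cand[k] for k in range(len(cand)) if P[k].max() >= beta}
        out[c] = sorted((I_1 | I_2) - {i_j})
    return out
```

Given two dense groups of prior sessions containing the same item, the sketch yields one explaining-item set per cluster, which is the aggregate-level view described above.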
Now referring back to FIG. 3, at step 310 of the method 300, the post-hoc explanation generation module 208 executed by the one or more hardware processors 102 is configured by the programmed instructions to generate a post-hoc explanation comprising a local explanation and a global explanation based on the plurality of local explaining items and the plurality of global explaining items using an explanation generation tool. For example, a Generative Pre-trained Transformer (GPT) is used for generating explanations.
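As an illustration of how the explaining items and meta-information may be passed to an explanation generation tool such as GPT, the sketch below assembles a verbalization prompt. The function name, argument layout, and prompt wording are hypothetical; the disclosure does not prescribe a specific prompt format.

```python
def build_explanation_prompt(recommended_item, current_session, explaining_items,
                             similar_prior_sessions, metadata):
    """Assemble explaining items and meta-information into an LLM prompt.

    All names here are illustrative; `metadata` maps item ids to human-readable
    attributes (category, brand, etc.) when available.
    """
    def describe(item):
        return f"{item} ({metadata.get(item, 'no metadata')})"

    lines = [
        "Recommended item: " + describe(recommended_item),
        "Current session clicks: " + ", ".join(describe(i) for i in current_session),
        "Explaining items: " + ", ".join(describe(i) for i in explaining_items),
        "Similar prior sessions: " + "; ".join(
            " -> ".join(s) for s in similar_prior_sessions),
        "In one short paragraph, explain to the user why the recommended item "
        "follows from the explaining items in this session.",
    ]
    return "\n".join(lines)
```

The resulting string can be sent to the LLM of choice; only the structured meta-information changes per session.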
Experimentation details:
In an embodiment, the present disclosure is experimented as follows. For example, the efficacy of the present disclosure is evaluated on two publicly available datasets, i.e., Diginetica (DN) and Amazon Musical Instruments (AMI). The DN dataset is large-scale real-world transactional data from the CIKM Cup 2016 challenge. The AMI dataset is a public dataset from Amazon, which contains user-item interactions from May 1996 to Oct 2014 and metadata such as descriptions, category, brand, etc. For DN, the present disclosure filtered out items which have a frequency less than 5, followed by removal of sessions of length 1. The sessions from the last 1 week were considered as test data. Finally, 0.7M and 30,574 sessions are considered for training and testing, respectively, with an average session length of 5.12 and 43,097 items. For AMI, the data of the most frequent 10^4 users is considered, and items with a frequency less than 5 are removed. Here, a user's transactions lying within 20 minutes are considered as a session. In addition, the sessions from the last 1 day are used as the test data for AMI. Finally, 18,128 and 6,126 sessions are considered for training and testing, respectively, with an average session length of 6.45 and 2,451 items. For both datasets, the remaining data is split chronologically into a training set and a validation set for training and model selection purposes, respectively. All sessions of length less than 3 are filtered out from the testing data for explanation.
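The item-frequency and session-length filtering described above can be sketched as follows; the function name and default thresholds are illustrative, matching the DN preprocessing (item frequency at least 5, sessions of length at least 2).

```python
from collections import Counter

def preprocess_sessions(sessions, min_item_freq=5, min_session_len=2):
    """Drop items with frequency below min_item_freq, then drop sessions that
    become too short, mirroring the preprocessing described for the DN dataset."""
    freq = Counter(i for s in sessions for i in s)
    kept = []
    for s in sessions:
        filtered = [i for i in s if freq[i] >= min_item_freq]
        if len(filtered) >= min_session_len:
            kept.append(filtered)
    return kept
```

Test-set carving by recency (last week/day) would follow the same pattern with a timestamp filter.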
The present disclosure considers two evaluation settings: quantitative and qualitative. For quantitative evaluation, the explaining items generated by the present disclosure are removed from test sessions and the performance of the SR model is observed. The idea is to validate whether the explaining items are necessary for the SR model to recommend the item that was actually clicked or bought in the original test session. The performance is compared on: i) original test sessions (OTS), ii) by removing any explaining items (-X), iii) by removing any non-explaining items (-NX), iv) by replacing the explaining items with items that are at the highest distance based on cosine similarity (-X+F), and v) by removing items based on popularity index (-P), where the popularity index of an item is calculated by dividing the total sales/clicks of the item by the total sales/clicks of all items. The present disclosure has used the standard evaluation metrics Recall@K and Mean Reciprocal Rank (MRR@K). Recall@K represents the proportion of test instances which have the target item in the top-K items. MRR@K is the average of the reciprocal ranks of the target item in the recommendation list. For qualitative evaluation, meta-information about the explaining items, along with metadata, current test sessions, and similar training sessions, is input to GPT-3, and verbalized explanations that are understandable by business users and system designers are obtained.
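The two metrics can be computed as in the standard sketch below, where each test instance contributes a ranked recommendation list and a target item; the function names are illustrative.

```python
def recall_at_k(ranked_lists, targets, k=20):
    """Proportion of test instances whose target item appears in the top-k list."""
    hits = sum(t in r[:k] for r, t in zip(ranked_lists, targets))
    return hits / len(targets)

def mrr_at_k(ranked_lists, targets, k=20):
    """Mean reciprocal rank of the target within the top-k list (0 if absent)."""
    total = 0.0
    for r, t in zip(ranked_lists, targets):
        topk = r[:k]
        total += 1.0 / (topk.index(t) + 1) if t in topk else 0.0
    return total / len(targets)
```

For example, a target ranked first contributes 1 to MRR@K, and one ranked second contributes 0.5.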
For example, the hyperparameter setup for the present disclosure is done as follows: A hold-out validation set is used for model selection using Recall@20 as the performance metric for all experiments in Table 2. The present disclosure uses an embedding dimension of 100 and a learning rate of 0.001 with the Adam optimizer. A grid-search is employed over the similarity threshold in {0.65, 0.5, 0.4, 0.3, 0.25}. The best thresholds on the validation set are 0.5 and 0.25 for DN and AMI, respectively. While explaining items at the global level, the present disclosure uses ε = 0.001 and min_samples = 4 for |S_tr^j| >= 20, otherwise min_samples = 2, which are the best on the validation set. For example, a value of 5 is used for obtaining candidate prior sessions.
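The hold-out model selection above amounts to a one-dimensional grid search over the threshold values; a minimal sketch, assuming a caller-supplied function that returns validation Recall@20 for a given threshold:

```python
def select_threshold(grid, recall_at_20_on_validation):
    """Return the threshold in `grid` maximizing validation Recall@20.

    `recall_at_20_on_validation` is an assumed evaluator callback, not an
    API defined by the disclosure.
    """
    return max(grid, key=recall_at_20_on_validation)

# e.g. select_threshold([0.65, 0.5, 0.4, 0.3, 0.25], evaluate_fn)
```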
Tables 1 and 2 show the performance for global explanation and local explanation, respectively. Now referring to Table 1, the global explainability evaluation is explained. Test sessions are considered where the respective item is recommended. One item is selected at random from each popularity bucket, i.e. long-tail (less popular), mid (moderately popular), and head (more popular), for both datasets. Here, OTS: original test sessions, X: explaining items, NX: non-explaining items, F: farthest items, P: popular items. The best results, i.e. percentage drops (%↓) in Recall@20 and MRR@20, are marked in bold and the second-best are underlined.
It has been observed from Table 2 that, by removing non-explaining items from test sessions (-NX), Recall@20 and MRR@20 drop by 1% and 3% as compared to OTS for DN and AMI, respectively. These slight drops indicate that non-explaining items are largely irrelevant for recommending the target item. Further, if popular items are removed (-P), considerable drops are observed, i.e., 9% and 22% in Recall@20, and 11% and 35% in MRR@20, indicating that popular items are relevant. However, significant drops are observed when explaining items are removed (-X), i.e., in Recall@20 by 16% and 29% and in MRR@20 by 17% and 41% for DN and AMI, respectively. This indicates that the explaining items generated by the present disclosure are crucial for recommending the target item. Moreover, a further drop is observed, in Recall@20 by 64% and 67% and in MRR@20 by 65% and 82%, when explaining items are replaced with the least similar items out of all the items (-X+F), due to the additional noise. Similarly, from Table 1, significant drops are observed when removing explaining items (-X), in terms of Recall@20, of 100%, 46%, 31% and 33%, 20%, 12% for the long-tail, mid, and head items for DN and AMI, respectively, which is significantly larger than for removing non-explaining items (-NX), i.e., 0%, 7%, 15% and 0%, 6%, 7%, and comparable to removing popular items (-P), i.e., 100%, 24%, 38% and 0%, 23%, 14%. Moreover, when explaining items are replaced with the least similar items out of all the items (-X+F), the drop in Recall@20 is as significant as 100%, 72%, 69% and 33%, 44%, 36%. Similar percentage drops are observed for MRR@20 from Table 1.
Table 1

| Test session variant | DN Long-tail (Item 17009) Recall@20 | DN Long-tail (Item 17009) MRR@20 | DN Mid (Item 2125) Recall@20 | DN Mid (Item 2125) MRR@20 | DN Head (Item 94) Recall@20 | DN Head (Item 94) MRR@20 | AMI Long-tail (Item 175) Recall@20 | AMI Long-tail (Item 175) MRR@20 | AMI Mid (Item 39) Recall@20 | AMI Mid (Item 39) MRR@20 | AMI Head (Item 102) Recall@20 | AMI Head (Item 102) MRR@20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| OTS | 12.50 | 1.79 | 41.82 | 14.45 | 68.42 | 20.41 | 12.00 | 7.00 | 30.32 | 19.41 | 42.94 | 30.62 |
| -P | 0.00 | 0.00 | 31.82 | 11.30 | 42.11 | 7.98 | 12.00 | 3.57 | 23.43 | 10.45 | 37.07 | 20.90 |
| -X | 0.00 | 0.00 | 22.73 | 5.90 | 47.37 | 15.45 | 8.00 | 0.60 | 24.33 | 13.63 | 37.72 | 22.94 |
| -NX | 12.50 | 0.96 | 39.09 | 12.20 | 57.89 | 38.16 | 12.00 | 4.57 | 28.53 | 16.67 | 40.15 | 25.50 |
| -X+F | 0.00 | 0.00 | 11.82 | 4.43 | 21.05 | 7.07 | 8.00 | 1.02 | 16.99 | 8.78 | 27.32 | 12.63 |
Table 2

| Test session variant | DN Recall@20 | DN MRR@20 | AMI Recall@20 | AMI MRR@20 |
|---|---|---|---|---|
| OTS | 44.16 | 12.55 | 26.64 | 16.81 |
| -X | 36.90 | 10.38 | 18.89 | 9.87 |
| -NX | 43.70 | 12.42 | 25.89 | 16.31 |
| -P | 40.26 | 11.20 | 20.81 | 10.91 |
| -X+F | 15.92 | 4.40 | 8.68 | 3.07 |
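The percentage drops discussed above follow the usual relative-drop formula against the OTS baseline; for instance, the reported 16% and 17% drops for -X on DN can be reproduced from the Table 2 entries:

```python
def pct_drop(baseline, variant):
    """Relative drop of a metric versus the original test sessions (OTS),
    rounded to the nearest whole percent."""
    return round(100 * (baseline - variant) / baseline)

# Table 2, DN column, removing explaining items (-X):
print(pct_drop(44.16, 36.90))  # Recall@20 drop -> 16
print(pct_drop(12.55, 10.38))  # MRR@20 drop -> 17
```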
Qualitative Analysis: Case Study on Amazon Musical Instruments Dataset: For example, the Amazon dataset is used for qualitative analysis due to the availability of meta-information for the items, which is not the case with Diginetica. First, it was studied why item “876: Thomastik-Infeld Accordion Accessory (JF344)”, which belongs to the “Bass Guitar Strings” category and the “Thomastik-Infeld” brand, is recommended in session 16. FIG. 6A shows the pair-wise similarity between the candidate item set. The pairs with high similarity, i.e. 0.33 and 0.32, are [876, 1771] and [717, 310], respectively. Hence, the relevant items based on similarity are ‘1771: Line 6 Relay G50 Wireless Guitar System’, ‘717: Pedaltrain MINI With Soft Case, Instrument Cable; Stage & Studio Cables; Pedaltrain’, and ‘310: Fender F Neckplate Chrome’. The relevant items based on frequency are as follows: ‘45: On-Stage SM7211B Professional Grade Folding Orchestral Sheet Music Stand’, ‘310: Fender F Neckplate Chrome’, ‘422: Classic Series Instrument Cable with Right Angle Plug’, and ‘875: Behringer Guitar Link UCG102 Ultimate Guitar-to-USB Audio Interface’. Further, the similarity between the relevant items and the current session items is shown in FIG. 6B. It was observed that the explaining items in the current session, ‘1214: Fender Precision Bass Pickups’ and ‘1631: Electric Guitar Bass Pickguard Screws; Pick Guards; Musiclily’, are close to item 45, item 310 and the recommended item 876. This is because they belong to the guitar accessories category. From FIGS. 6A and 6B, it was concluded that item 876 is recommended in session 16 because the explaining items and relevant items are related to guitar accessories. Similar verbalized explanations are obtained from GPT-3, as shown in FIG. 6D.
Further, it was studied why an item “102: D’Addario EJ26 Phosphor Bronze Acoustic Guitar Strings, Custom Light”, which is of the ‘D’Addario’ brand and belongs to the ‘Acoustic Guitar Strings’ category, gets recommended in general. FIG. 6C shows that it is similar to ‘255: PlanetWaves Acoustic Guitar Quick-Release System’, ‘974: Martin M Acoustic Guitar Bridge Pins’, ‘283: Snark SN1 Guitar Tuner’, ‘696: Ernie Ball Earthwood Light Phosphor Bronze Acoustic String Set’ and ‘103: D’Addario Phosphor Bronze Acoustic Guitar Strings, Medium’ with similarity 0.35, 0.35, 0.26 and 0.24, respectively, i.e. all explaining prior items are related to guitar accessories and all relevant prior sessions contain items of the same brand, ‘D’Addario’. The same explanation is seen from GPT-3, as shown in FIG. 6E: item 102 is in the same category (Acoustic Guitar Strings) as the other prior explaining item 103 and it is from the same brand, D’Addario. Additionally, it is a lighter gauge string than the other items, which may be more suitable for some customers.
The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
The embodiments of the present disclosure herein address the unresolved problem of generating post-hoc explanations that reflect the model’s true behavior at two levels: local and global. Further, the present disclosure provides a quantitative evaluation of the generated explanations in terms of explaining items using commonly used metrics such as Recall and MRR. Finally, the present disclosure provides verbalized explanations via Large Language Models (LLMs) to improve the readability of explanations.
It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed, including e.g. any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g. hardware means like e.g. an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g. an ASIC and an FPGA, or at least one microprocessor and at least one memory with software modules located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g. using a plurality of CPUs, GPUs and edge computing devices.
The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various modules described herein may be implemented in other modules or combinations of other modules. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. 
Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e. non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
It is intended that the disclosure and examples be considered as
exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.

Claims:

WE CLAIM:
1. A processor implemented method (300), the method comprising:
receiving (302), by one or more hardware processors, recommendation information generated by a recommendation engine in response to an item clicked by a user in an e-commerce environment, wherein the recommendation information comprises a plurality of recommended items, a plurality of learnt item embeddings, and a plurality of learnt session embeddings;
identifying (304), by the one or more hardware processors, a plurality of prior sessions and a plurality of prior session embeddings from among the plurality of learnt session embeddings corresponding to the plurality of recommended items based on an availability of the plurality of recommended items in the plurality of prior sessions;
generating (306), by the one or more hardware processors, a plurality of local explaining items based on the plurality of prior session embeddings using a cosine similarity based local explanation generation technique;
simultaneously generating (308), by the one or more hardware processors, a plurality of global explaining items based on the plurality of prior session embeddings using a clustering based global explanation generation technique; and
generating (310), by the one or more hardware processors, a post-hoc explanation comprising a local explanation and a global explanation based on the plurality of local explaining items and the plurality of global explaining items using an explanation generation tool.
2. The method as claimed in claim 1, wherein the steps for generating the plurality of local explaining items based on the plurality of prior session embeddings using the cosine similarity based local explanation generation technique comprises:
selecting a plurality of local candidate prior sessions from among the plurality of prior session embeddings by computing a cosine similarity value between a plurality of current session embeddings and the plurality of prior session embeddings, wherein a plurality of prior session embeddings with a cosine similarity value greater than a predefined threshold are selected as the plurality of candidate prior sessions;
obtaining a plurality of local candidate items based on the plurality of candidate prior sessions, wherein each of the plurality of items associated with the plurality of candidate prior sessions are obtained;
selecting a plurality of optimum pair-wise similar items by computing a pairwise similarity score between each of the plurality of local candidate items, wherein each pair with a corresponding pair-wise similarity score greater than a predefined threshold are selected;
obtaining a plurality of relevant items by combining the plurality of optimum pair-wise similar items and a plurality of frequently occurring items, wherein the plurality of frequently occurring items are selected from the plurality of local candidate items based on frequency of occurrence; and
generating a plurality of local explaining items by computing a cosine similarity between the plurality of relevant items and the items clicked by the user in current session, wherein the plurality of items clicked in current session with maximum similarity with the plurality of relevant items are obtained.
3. The method as claimed in claim 1, wherein the steps of generating the plurality of global explaining items based on the plurality of prior session embeddings using the clustering based global explanation generation technique comprises:
generating a plurality of session clusters by clustering the plurality of prior sessions using a density based clustering technique;
computing a centroid associated with each of the plurality of session clusters using an averaging technique;
obtaining a plurality of global candidate prior sessions from among the plurality of prior session embeddings by computing a cosine similarity between the plurality of prior session embeddings and the centroid, wherein the plurality of prior session embeddings with a cosine similarity greater than a predefined cosine similarity threshold are selected as the plurality of global candidate prior sessions;
obtaining a plurality of global candidate items from the plurality of global candidate prior sessions, wherein each of the plurality of items associated with the plurality of candidate prior sessions are obtained;
selecting a plurality of global pair-wise similar items by computing a pairwise similarity score between each of the plurality of global candidate items, wherein each pair with a corresponding pair-wise similarity score greater than a predefined pair-wise similarity score threshold are selected; and
obtaining a plurality of global explaining items by combining the plurality of global pair-wise similar items and the plurality of frequently occurring items, wherein the plurality of frequently occurring items are selected from the plurality of global candidate items.
4. A system (100) comprising:
at least one memory (104) storing programmed instructions; one or more Input /Output (I/O) interfaces (112); and one or more hardware processors (102) operatively coupled to the at least one memory (104), wherein the one or more hardware processors (102) are configured by the programmed instructions to:
receive recommendation information generated by a recommendation engine in response to an item clicked by a user in an e-commerce environment, wherein the recommendation information comprises a plurality of recommended items, a plurality of learnt item embeddings, and a plurality of learnt session embeddings;
identify a plurality of prior sessions and a plurality of prior session embeddings from among the plurality of learnt session embeddings corresponding to the plurality of recommended items based on an availability of the plurality of recommended items in a plurality of prior sessions;
generate a plurality of local explaining items based on the plurality of prior session embeddings using a cosine similarity based local explanation generation technique;
simultaneously generate a plurality of global explaining items based on the plurality of prior session embeddings using a clustering based global explanation generation technique; and
generate a post-hoc explanation comprising a local explanation and a global explanation based on the plurality of local explaining items and the plurality of global explaining items using an explanation generation tool.
5. The system of claim 4, wherein the steps for generating the plurality of local explaining items based on the plurality of prior session embeddings using the cosine similarity based local explanation generation technique comprises:
selecting a plurality of local candidate prior sessions from among the plurality of prior session embeddings by computing a cosine similarity value between a plurality of current session embeddings and the plurality of prior session embeddings, wherein a plurality of prior session embeddings with a cosine similarity value greater than a predefined threshold are selected as the plurality of candidate prior sessions;
obtaining a plurality of local candidate items based on the plurality of candidate prior sessions, wherein each of the plurality of items associated with the plurality of candidate prior sessions are obtained;
selecting a plurality of optimum pair-wise similar items by computing a pairwise similarity score between each of the plurality of local candidate items, wherein each pair with a corresponding pair-wise similarity score greater than a predefined threshold are selected;
obtaining a plurality of relevant items by combining the plurality of optimum pair-wise similar items and a plurality of frequently occurring items, wherein the plurality of frequently occurring items are selected from the plurality of local candidate items based on frequency of occurrence; and
generating a plurality of local explaining items by computing a cosine similarity between the plurality of relevant items and the items clicked by the user in current session, wherein the plurality of items clicked in current session with maximum similarity with the plurality of relevant items are obtained.
6. The system of claim 4, wherein the steps of generating the plurality of global explaining items based on the plurality of prior session embeddings using the clustering based global explanation generation technique comprises:
generating a plurality of session clusters by clustering the plurality of prior sessions using a density based clustering technique;
computing a centroid associated with each of the plurality of session clusters using an averaging technique;
obtaining a plurality of global candidate prior sessions from among the plurality of prior session embeddings by computing a cosine similarity between the plurality of prior session embeddings and the centroid, wherein the plurality of prior session embeddings with a cosine similarity greater than a predefined cosine similarity threshold are selected as the plurality of global candidate prior sessions;
obtaining a plurality of global candidate items from the plurality of global candidate prior sessions, wherein each of the plurality of items associated with the plurality of candidate prior sessions are obtained;
selecting a plurality of global pair-wise similar items by computing a pairwise similarity score between each of the plurality of global candidate items, wherein each pair with a corresponding pair-wise similarity score greater than a predefined pair-wise similarity score threshold are selected; and
obtaining a plurality of global explaining items by combining the plurality of global pair-wise similar items and the plurality of frequently occurring items, wherein the plurality of frequently occurring items are selected from the plurality of global candidate items.
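The local (claims 2 and 5) and global (claims 3 and 6) explanation generation steps can be sketched as below. This is an illustrative sketch only: all thresholds, the `num_freq` cut-offs, the dictionary data layouts, and the assumption that a density-based clustering (e.g. DBSCAN) has already produced a session cluster are choices made here for illustration, not values fixed by the claims.

```python
from collections import Counter
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def _pairwise_and_frequent(cand_items, item_embs, pair_thr, num_freq):
    """Pair-wise similar items above a threshold, merged with the most
    frequently occurring candidate items (the 'relevant items' pool)."""
    uniq = sorted(set(cand_items))
    pair_sim = {i for a in uniq for b in uniq
                if a < b and cosine(item_embs[a], item_embs[b]) > pair_thr
                for i in (a, b)}
    frequent = {i for i, _ in Counter(cand_items).most_common(num_freq)}
    return pair_sim | frequent

def local_explaining_items(cur_emb, cur_items, prior_embs, prior_sessions,
                           item_embs, sess_thr=0.5, pair_thr=0.3, num_freq=4):
    """Claim 2 sketch: rank current-session items by similarity to the
    relevant items derived from similar prior sessions."""
    # Candidate prior sessions: cosine similarity to the current session.
    cands = [s for s, e in prior_embs.items() if cosine(cur_emb, e) > sess_thr]
    cand_items = [i for s in cands for i in prior_sessions[s]]
    relevant = _pairwise_and_frequent(cand_items, item_embs, pair_thr, num_freq)
    # Explaining items: current-session items closest to the relevant items.
    def best_sim(i):
        return max(cosine(item_embs[i], item_embs[r]) for r in relevant)
    return sorted(cur_items, key=best_sim, reverse=True)

def global_explaining_items(cluster_sessions, prior_embs, prior_sessions,
                            item_embs, sim_thr=0.5, pair_thr=0.3, num_freq=2):
    """Claim 3 sketch for one session cluster: centroid by averaging, then
    candidate sessions near the centroid, then the relevant-item merge."""
    centroid = np.mean([prior_embs[s] for s in cluster_sessions], axis=0)
    cands = [s for s in cluster_sessions
             if cosine(prior_embs[s], centroid) > sim_thr]
    cand_items = [i for s in cands for i in prior_sessions[s]]
    return _pairwise_and_frequent(cand_items, item_embs, pair_thr, num_freq)
```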
Dated this 14th Day of June 2023
Tata Consultancy Services Limited
By their Agent & Attorney
(Adheesh Nargolkar)
of Khaitan & Co
Reg No IN-PA-1086