Abstract: CONTEXT-AWARE TOKEN IMPORTANCE MODEL BASED SYSTEM FOR E-COMMERCE QUERY REFINEMENT AND METHOD THEREOF. The present invention relates to a context-aware token importance model-based system for e-commerce query refinement. The system comprises a software module (102) installed in an operating device (128), the software module (102) comprising a token importance model (104) configured to evaluate the significance of individual tokens within a user search query. The token importance model (104) incorporates pre-trained word embeddings (106), Named Entity Recognition (NER) entity tags (108), and a fusion mechanism (110) for combining term embeddings with entity embeddings. The system further comprises a Named Entity Recognition (NER) Model (112), an embedding layer (114), multiple processing layers (116), a classification layer (118), a threshold evaluation component (120), a query refinement module (122), and a product retrieval interface (124). The system refines queries using a novel fusion of dynamic token importance scoring and named entity recognition, employing threshold-based token selection that adapts to query context. The system learns from user behavior through training on query chain data from search logs, enabling continuous improvement in query interpretation, refinement, and product retrieval performance. Figure 1
Description:
FIELD OF INVENTION
[0001] The present invention relates to e-commerce search systems. Particularly, the present invention relates to a context-aware token importance model-based system and method for refining e-commerce search queries and improving product retrieval. More particularly, the present invention relates to a system and method that enhances search accuracy and efficiency by intelligently refining user queries through a novel fusion of dynamic token importance scoring and named entity recognition, thereby adapting to query context, significantly improving product retrieval relevance, reducing search time, and enhancing user experience in e-commerce environments. The unique combination of neural network-based token importance modeling and entity recognition enables more precise interpretation of user intent, handling of complex queries, and dynamic adaptation to evolving e-commerce search.
BACKGROUND OF THE INVENTION
[0002] E-commerce platforms have become increasingly popular for consumers to search for and purchase products online. These platforms typically provide search functionality that allows users to enter queries to find relevant items in the product catalog. As e-commerce catalogs have grown to include millions of products across diverse categories, delivering accurate and relevant search results has become more challenging.
[0003] Traditional e-commerce search engines often rely on keyword matching between user queries and product metadata such as titles, descriptions, and attributes. While this approach can work well for simple queries, it faces difficulties with more complex, natural language queries that users commonly enter. Long, detailed queries may contain extraneous words or phrases that are not actually relevant for finding the desired products. At the same time, important contextual information in the query may be overlooked.
[0004] Another issue is that the same words can have different levels of importance depending on the context of the query and the types of products being searched for. For example, the word "red" may be crucial when searching for a specific color of clothing, but less relevant when searching for electronics. Existing search systems often lack the ability to dynamically assess token importance based on query context.
[0005] Additionally, named entities like brands, product lines, or specific models are frequently included in e-commerce queries but may not be properly recognized and leveraged by basic keyword matching approaches. This can lead to irrelevant results when entity names happen to match other product attributes.
[0006] The limitations of keyword-based search can result in poor relevance, with irrelevant products being surfaced while more suitable matches are missed. Users may need to reformulate queries multiple times or scroll through many pages of results to find what they are looking for. This creates friction in the shopping experience and can lead to abandoned searches and lost sales opportunities for e-commerce businesses.
[0007] Furthermore, as e-commerce catalogs continue to expand, the computational resources required to process and match queries against millions of products in real-time become increasingly demanding. There is a need for more efficient query processing that can deliver fast, relevant results at scale.
[0008] The performance and efficiency of search algorithms also present ongoing challenges. As product catalogs expand and user traffic increases, search systems must be able to process queries and return results quickly, even under high load. Optimizing query processing to reduce latency while maintaining or improving result quality is an area of active research and development in the e-commerce industry.
[0009] Moreover, personalization and adaptation to user intent pose difficulties for current search implementations. Users may have varying preferences or implicit intents behind similar queries, which are not always captured by keyword-based approaches. Developing more nuanced and context-aware query interpretation could lead to improved search experiences tailored to individual users.
[0010] Furthermore, users on an e-commerce platform often tend to be excessively articulate when expressing their shopping intent. For example, 'microfibre mop for bathroom tiles with 5-year warranty' is an elaborate search query. Many terms in such user queries are redundant from a product retrieval perspective. An optimized version of the above query would be simply 'microfibre mop'. Such overly articulate queries not only challenge the relevance of the retrieved products but also affect the performance of the systems and add latency.
[0011] There are several patent applications that relate to E-commerce search and query processing systems. One such United States Patent Application, US9600529B2, discloses an attribute-based document searching system that generates scores for attributes of products, including a token. The scores are based on the frequency of occurrence of the token in the attribute value and the length of the attribute value. While this approach aims to improve relevance, it does not adequately account for the contextual importance of tokens or handling complex natural language queries. The system lacks the ability to dynamically assess token importance based on query context and user intent.
[0012] Another United States Patent Application, US20230351184A1, describes a query classification system using sparse soft labels. It generates queries based on extracted features and documents and selects a subset of queries based on precision scores. However, this method focuses primarily on query classification rather than token-level importance. It does not provide a mechanism for dynamically weighting individual tokens based on their contextual relevance within a query.
[0013] Yet another United States Patent Application, US11709844B2, relates to a computerized smart inventory search system using classification and tagging. The system maps search terms to domain objects and uses machine learning for classification. While this approach incorporates some context, it does not specifically address the varying importance of individual tokens within complex queries. The system may struggle with queries containing extraneous or less relevant terms.
[0014] Yet another United States Patent Application, US20220179895A1, discloses a method for classification and tagging of textual data using automatically learned queries. It generates queries based on extracted features and assigns labels to textual portions. However, this approach does not provide a sophisticated mechanism for assessing the relative importance of different tokens within a single query. It may not effectively handle queries with varying levels of token relevance.
[0015] Yet another United States Patent Application, US20230281257A1, describes a system for determining search token importance using machine learning. While this system does attempt to assess token importance, it relies heavily on historical engagement data and may not adapt well to new or evolving query patterns. Additionally, it does not incorporate named entity recognition to enhance understanding of query context.
[0016] These prior art systems, while each addressing certain aspects of e-commerce search, have several limitations. They generally lack the ability to dynamically assess token importance in real-time based on the full query context. Most do not effectively combine token importance scoring with named entity recognition to gain a more comprehensive understanding of user intent. Further, these systems may struggle with handling complex, natural language queries that are becoming increasingly common in e-commerce search.
[0017] Keeping in view the challenges associated with the above state of art, there is a need for a context-aware token importance model-based system and method for refining e-commerce search queries that can dynamically assess token importance, recognize named entities, and adapt to query context, thereby improving search accuracy, efficiency, and user experience in e-commerce environments.
SUMMARY OF THE INVENTION
[0018] The present invention relates to a context-aware token importance model-based system for e-commerce query refinement and a method for operation thereof. The invention provides a system comprising a software module installed in an operating device, where the software module includes a token importance model with pre-trained word embeddings, Named Entity Recognition (NER) entity tags, and a fusion mechanism for combining term embeddings with entity embeddings. The system further incorporates a Named Entity Recognition (NER) Model, an embedding layer, multiple processing layers, a classification layer, a threshold evaluation component, a query refinement module, and a product retrieval interface. This comprehensive system works in conjunction with an e-commerce product database to process, refine, and optimize user search queries for improved product retrieval in e-commerce environments.
[0019] The system significantly enhances search relevance by identifying and removing redundant or non-essential terms from user queries, thereby improving the accuracy of product retrieval and reducing the occurrence of null search results. Query processing efficiency is enhanced through the elimination of redundant terms, reducing computational load on the search engine. The context-aware nature of the query refinement process allows for more intelligent interpretation of user intent, adapting to the specific context of each query. By training on query chain data from search logs, the system continually adapts to user behavior, enabling ongoing performance improvements. Users benefit from superior experience with more relevant search results and fewer null searches, potentially leading to increased engagement and higher conversion rates. The system's architecture, based on transformer models and entity recognition, allows for scalability across various types of e-commerce queries and product categories, while also effectively handling long tail and complex queries that can challenge traditional search systems. This automated approach minimizes the need for manual curation of search terms or maintenance of keyword lists. Furthermore, the system is designed for seamless integration with existing e-commerce search infrastructures, complementing and enhancing current search capabilities without requiring a complete overhaul.
OBJECTIVE OF THE INVENTION
[0020] The primary objective of the present invention is to provide a context-aware token importance model-based system for e-commerce query refinement and a method for operation thereof.
[0021] Another objective of the present invention is to enhance search accuracy and efficiency in e-commerce environments by intelligently refining user queries through a novel fusion of dynamic token importance scoring and named entity recognition.
[0022] Another objective of the present invention is to improve product retrieval relevance in e-commerce search systems by adapting to query context and more precisely interpreting user intent.
[0023] Another objective of the present invention is to reduce search time and enhance user experience in e-commerce platforms by effectively handling complex, natural language queries.
[0024] Another objective of the present invention is to enable more efficient query processing that can deliver fast, relevant results at scale as e-commerce catalogs continue to expand.
[0025] Another objective of the present invention is to provide a system and method that can dynamically assess token importance in real-time based on the full query context, overcoming limitations of traditional keyword-based search approaches.
[0026] Another objective of the present invention is to combine token importance scoring with named entity recognition to gain a more comprehensive understanding of user intent in e-commerce search queries.
[0027] Another objective of the present invention is to address the challenge of varying token importance based on query context and product types being searched for in e-commerce environments.
[0028] Yet another objective of the present invention is to improve the recognition and leveraging of named entities like brands, product lines, or specific models frequently included in e-commerce queries.
[0029] Yet another objective of the present invention is to reduce the need for users to reformulate queries multiple times or scroll through many pages of results to find desired products in e-commerce search.
BRIEF DESCRIPTION OF DRAWINGS
[0030] FIG. 1 illustrates a block diagram of a context-aware token importance model-based system, according to aspects of the present disclosure.
DETAILED DESCRIPTION OF THE INVENTION
[0031] Accordingly, the present invention relates to the field of e-commerce search systems and methods for improving query processing and product retrieval. More specifically, the invention pertains to a system and method for refining user search queries in e-commerce platforms by identifying and removing redundant or non-essential terms while preserving the core intent of the query. The invention utilizes advanced natural language processing techniques and machine learning models to analyze and optimize search queries, thereby enhancing the relevance and efficiency of product retrieval in e-commerce search engines.
[0032] In an embodiment, the system comprises several interconnected components as described below:
[0033] (A) Software module (102): A software module (102) is installed in an operating device (128) for refining e-commerce search queries provided by the user and improving product retrieval. The software module (102) helps enhance search accuracy and efficiency by intelligently processing and optimizing user queries. The software module (102) further comprises the following sub-modules:
(i) A token importance model (104): The token importance model (104) is a transformer-based model that evaluates the significance of individual tokens, such as words or phrases within a user search query. The token importance model (104) comprises:
a. Pre-trained word embeddings (106): Pre-trained word embeddings (106) are vector representations of words that capture semantic relationships and meanings. These embeddings are generated by training large language models on vast corpora of text data. The pre-trained word embeddings (106) in the present invention provide initial semantic representations for input tokens, allowing the token importance model (104) to start with a rich understanding of word meanings and relationships. This pre-training helps the token importance model (104) evaluate the significance of individual tokens within a user search query more effectively, even when encountering unfamiliar or complex terms in e-commerce search queries.
b. Named Entity Recognition (NER) entity tags (108): The token importance model (104) utilizes NER tags (108) to enhance understanding of token types.
c. Fusion mechanism (110): The fusion mechanism (110) combines term embeddings with entity embeddings by adding the vectors and passing them through a neural network layer. This process creates a comprehensive token representation that captures both semantic meaning and entity information, allowing for more nuanced interpretation of each token's role and importance within the query context.
(ii) A Named Entity Recognition (NER) Model (112): The NER Model (112) applies a natural language processing technique that identifies and classifies named entities such as, but not limited to, product names within text. By recognizing and categorizing specific entities within a query, the NER Model (112) provides entity information to the token importance model (104), which utilizes the resulting NER tags (108) to enhance understanding of token types. This is particularly valuable in e-commerce contexts where brand names, product categories, or specific model numbers may be crucial to understanding the query. The NER Model (112) outputs entities for tokens in word format, such as, but not limited to, brand, gender, and category. The embedding (ai) of each such output entity is initialized using fastText embeddings, where, for instance, a1 represents the embedding for brand, a2 represents the embedding for gender, and so on for other entity types. These initialized embeddings are then fine-tuned during the training process. The NER Model (112) itself is trained beforehand and is not updated during training of the token importance model (104). The NER Model (112) generates the following entity embeddings for input tokens:
(a) A brand embedding a1: It represents brand-related information for relevant tokens.
(b) A gender embedding a2: It represents gender-related information for applicable tokens.
(c) A category embedding a3: It encodes category-related information for applicable tokens.
(d) Other embeddings ai: They capture miscellaneous entity information not covered by other categories.
(e) A null embedding a0: It represents tokens that do not correspond to any specific named entity.
The NER Model (112) outputs these entity embeddings, which are then combined with the word embeddings in the fusion mechanism (110) of the token importance model (104) to create comprehensive token representations. This combined representation allows the token importance model (104) to consider both semantic and entity-specific information when assessing the importance of each token in the query.
(iii) An Embedding Layer (114): The embedding layer (114) processes input tokens to generate initial word embeddings. The embedding layer (114) takes the raw text of a user's search query and converts each word or subword into a dense vector representation. These vectors are typically high-dimensional (e.g., 100-300 dimensions) and capture semantic relationships between words.
(iv) Multiple Processing Layers (116): A series of transformer/processing layers (e.g., Layer 1, Layer 2, Layer 3) (116) progressively refine token representations, capturing complex relationships and contextual information within the query. Each layer consists of self-attention mechanisms and feed-forward neural networks. As the token representations pass through these layers (116), they are progressively refined. The self-attention mechanism allows each token to attend to all other tokens in the query, capturing complex relationships between words. This process enables the model to understand context-dependent meanings, long-range dependencies, and subtle semantic nuances within the query.
(v) Classification Layer (118): The classification layer (118) assigns importance scores to each token based on the processed representations, enabling the system to identify which tokens are most relevant to the user's search intent.
(vi) A Threshold Evaluation Component (120): The threshold evaluation component (120) compares token importance scores against a predefined threshold to determine which tokens to retain or remove, effectively refining the original query by eliminating less relevant terms. This component acts as a filter, using the importance scores from the classification layer (118). The predefined threshold is initially set based on empirical analysis of query performance data, but it is dynamically adjustable. The system employs a feedback loop that monitors query refinement outcomes and adjusts the threshold accordingly to optimize performance across different query types and product categories. Tokens scoring above the threshold are kept in the refined query, while those below are removed. This process helps streamline the query by keeping only the most relevant terms.
(vii) A Query Refinement Module (122): Query refinement module (122) takes the output from the Threshold Evaluation Component (120) and constructs a refined query, optimizing it for improved product retrieval. The query refinement module (122) constructs the refined query by ordering the retained tokens based on their importance scores and the original query structure. It may also incorporate additional relevant terms from the product catalog that are semantically related to the high-scoring tokens, further enhancing the query's effectiveness in retrieving relevant products. Thus, the result is a streamlined, optimized version of the original user query.
(viii) A Product Retrieval Interface (124): The product retrieval interface (124) interfaces with the e-commerce platform's product database, using the refined query to fetch and rank relevant products. It ensures that the improvements made by the token importance model (104) and query refinement process translate into better, more relevant product recommendations for the user.
[0034] (B) E-commerce product database (126): The e-commerce product database (126) is directly connected to the product retrieval interface (124). It serves as the primary source of product information for the search system. When the product retrieval interface (124) receives a refined query from the query refinement module (122), it queries this database (126) to fetch and rank relevant products.
[0035] (C) Operating Device (128): The operating device (128) hosts and executes the software module (102) for e-commerce query refinement and product retrieval. In an exemplary embodiment, the operating device (128) may be, but is not limited to, a server, a computer, a laptop, or other suitable computing hardware capable of running complex algorithms and processing large amounts of data. The operating device (128) may be accessed by e-commerce platform administrators or search engineers for configuration and maintenance and can be integrated with existing e-commerce infrastructure to directly enhance the search functionality for end-users (online shoppers). The operating device (128) plays a crucial role in enabling real-time query processing, allowing for swift refinement of user queries and improved product retrieval in e-commerce environments.
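By way of a non-limiting illustration, the fusion mechanism (110) described above, in which a term embedding and an entity embedding are added and passed through a neural network layer, may be sketched as follows. The 4-dimensional vectors, the random weights, the tanh activation, and the names fuse, entity_table, W, and b are assumptions made only for this sketch; the specification does not fix the layer type or its parameters.

```python
import math
import random

def fuse(term_vec, entity_vec, weights, bias):
    """Add the term and entity vectors, then apply one dense layer with tanh."""
    summed = [t + e for t, e in zip(term_vec, entity_vec)]
    fused = []
    for row, b_i in zip(weights, bias):
        pre_activation = sum(w * x for w, x in zip(row, summed)) + b_i
        fused.append(math.tanh(pre_activation))  # activation choice is an assumption
    return fused

DIM = 4  # toy size; the description mentions roughly 100-300 dimensions
rng = random.Random(0)

# Hypothetical entity embedding table: null (a0), brand (a1), gender (a2), category (a3).
entity_table = {
    tag: [rng.uniform(-1, 1) for _ in range(DIM)]
    for tag in ("null", "brand", "gender", "category")
}

# Hypothetical dense-layer parameters; in a trained model these would be learned.
W = [[rng.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(DIM)]
b = [0.0] * DIM

# Made-up term embedding for the token "Nike", tagged as a brand by the NER model.
term_embedding = [0.2, -0.5, 0.8, 0.1]
fused_nike = fuse(term_embedding, entity_table["brand"], W, b)
print(fused_nike)
```

The fused vector has the same dimension as its inputs, so it can be fed to the processing layers (116) in place of the plain term embedding.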
[0036] In an exemplary embodiment, the working of the system involves a user entering a search query, for example, "comfortable red Nike running shoes for women", into the e-commerce platform's search bar. The token importance model (104) receives this query as input. First, the embedding layer (114) converts each word into a vector representation, capturing semantic meanings. For example, "Nike" might be represented as [0.2, -0.5, 0.8, ...], while "running" could be [0.6, 0.3, -0.1, ...]. Thereafter, the Named Entity Recognition (NER) model (112) identifies "Nike" as a brand entity and "running shoes" as a product category entity. It generates corresponding entity embeddings, enhancing the token representations. The multiple processing layers (116) then analyze these embeddings, capturing relationships between words. For instance, they might recognize that "comfortable" and "red" are attributes specifically related to "running shoes". The classification layer (118) then assigns importance scores to each token. In this example, "Nike" might receive a score of 0.9, "running shoes" 0.85, "comfortable" 0.7, "red" 0.6, and "women" 0.75, while "for" receives a low score of 0.1. The threshold evaluation component (120), with a hypothetical threshold of 0.5, would then retain "Nike", "running shoes", "comfortable", "red", and "women", while removing "for". The query refinement module (122) constructs a new query using these retained tokens, potentially reordering them based on importance: "Nike running shoes women comfortable red". Finally, the product retrieval interface (124) uses this refined query to search the product database (126). It may prioritize matching Nike brand running shoes designed for women, then filter or rank results based on comfort features and red color options.
This refined search is more likely to return relevant products, improving the user's shopping experience by showing them precisely what they're looking for, without irrelevant results that might have been caused by less important terms in the original query.
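The thresholding and reordering steps of the above example may be illustrated with the following minimal sketch. The scores and the threshold of 0.5 are the hypothetical values given in the example; ordering the retained tokens purely by descending score is one possible reading of "reordering them based on importance" and is an assumption of this sketch.

```python
# Hypothetical importance scores from the worked example, in original query order.
scores = [
    ("comfortable", 0.7),
    ("red", 0.6),
    ("Nike", 0.9),
    ("running shoes", 0.85),
    ("for", 0.1),
    ("women", 0.75),
]
THRESHOLD = 0.5  # hypothetical threshold from the example

# Threshold evaluation: keep tokens scoring above the threshold.
retained = [(token, score) for token, score in scores if score > THRESHOLD]

# Query refinement: order retained tokens by descending importance.
# sorted() is stable, so equally scored tokens would keep their query order.
ordered = sorted(retained, key=lambda pair: pair[1], reverse=True)
refined = " ".join(token for token, _ in ordered)

print(refined)  # Nike running shoes women comfortable red
```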
[0037] In an exemplary embodiment, the system may contribute to improvements in various e-commerce metrics. These improvements may include, but are not limited to, reductions in null search rates, increases in conversion rates, and enhancements in overall user engagement with the e-commerce platform.
[0038] In an exemplary embodiment, as shown in FIG. 1, the system processes input tokens through multiple layers to determine their importance in a search query. The token importance model (104) receives input tokens, such as "apple," "company," and "laptop," and passes them through an embedding layer (114). This embedding layer (114) generates initial word embeddings for each input token. The system incorporates a Named Entity Recognition (NER) Model (112) that generates entity embeddings for the input tokens. The NER Model (112) outputs three distinct embeddings: the brand embedding e1, the other embedding e2, and the category embedding e3. These entity embeddings are combined with the processed token information to provide a comprehensive representation of each token. The token importance model (104) utilizes a 3-layer encoder-only transformer architecture. The processed embeddings pass through multiple processing layers (116), including Layer 1, Layer 2, and Layer 3. These layers (116) progressively refine the token representations, capturing complex relationships and contextual information. After the encoder layers, the token importance model (104) includes a classification layer (118). This classification layer (118) assigns importance scores to each input token based on the processed representations. In the example shown in FIG. 1, the classification layer (118) outputs numerical values as importance scores: 0.87 for "apple," 0.12 for "company," and 0.96 for "laptop." The system uses a predefined threshold to filter out redundant terms based on the assigned importance scores. Tokens with scores below the threshold are considered redundant and are dropped from the query. In FIG. 1, the token "company" is marked in red, indicating that its score (0.12) falls below the threshold and is therefore removed. 
Conversely, tokens "apple" and "laptop" are marked in green, signifying that their scores (0.87 and 0.96, respectively) exceed the threshold and are retained in the refined query. By combining token-level processing, entity recognition, and importance scoring, the token importance model (104) enables context-aware processing of search terms. This approach allows for effective query refinement by identifying and removing redundant or non-essential terms while preserving the core intent of the user's search query.
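For illustration only, the self-attention operation performed inside each processing layer (116), whereby each token attends to all other tokens in the query, may be sketched as below. The sketch is a simplification under stated assumptions: it uses a single attention head with identity query, key, and value projections, omits the feed-forward network, and operates on made-up 3-dimensional vectors for the tokens of FIG. 1. It is not the trained model.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    outputs, all_weights = [], []
    for query in vectors:
        # Similarity of this token to every token in the query (itself included).
        logits = [sum(q * k for q, k in zip(query, key)) / scale for key in vectors]
        weights = softmax(logits)
        # Contextual representation: attention-weighted mix of all token vectors.
        mixed = [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

tokens = ["apple", "company", "laptop"]
toy_vectors = [[0.9, 0.1, 0.3], [0.2, 0.8, 0.1], [0.7, 0.2, 0.6]]  # made-up values
contextual, attention = self_attention(toy_vectors)

for token, weights in zip(tokens, attention):
    print(token, [round(w, 3) for w in weights])
```

Each row of attention weights sums to one, so every output vector is a convex combination of the input token vectors; stacking such layers yields the progressively refined representations that the classification layer (118) scores.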
[0039] In an embodiment, the token importance model (104) may be trained using query chain data from search logs. This training approach allows the system to learn from real-world user behavior and query patterns, enhancing its ability to identify important tokens in e-commerce search queries.
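One possible way of turning a query chain into token-level training labels is sketched below. The labeling rule (a token of the initial query is labeled 1 if it survives in the later, successful query, and 0 otherwise) and the function name label_tokens are assumptions of this sketch; the specification does not state how labels are derived from search logs. The query pair is the example from the background section.

```python
def label_tokens(initial_query, successful_query):
    """Label each token of the initial query: 1 if kept in the successful query, else 0."""
    kept = set(successful_query.lower().split())
    return [(token, 1 if token.lower() in kept else 0)
            for token in initial_query.split()]

# Hypothetical query chain: a verbose query followed by the reformulation
# that led to a successful product discovery.
chain = ("microfibre mop for bathroom tiles with 5-year warranty", "microfibre mop")
labels = label_tokens(*chain)

for token, label in labels:
    print(f"{token}: {label}")
```

Labels of this kind could serve as targets for the classification layer (118), with the NER Model (112) held fixed during training as described above.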
[0040] The token importance model (104) may contribute to improvements in various e-commerce metrics. These improvements may include reductions in null search rates, increases in conversion rates, and enhancements in overall user engagement with the e-commerce platform.
[0041] In an embodiment, the system is designed to continuously learn and improve its performance through the analysis of query chain data from search logs. It tracks sequences of queries within user sessions, identifying patterns in query reformulation and the eventual queries that lead to successful product discoveries. This data is used to fine-tune the token importance model (104), adjust thresholds, and improve the query refinement process, enabling the system to adapt to evolving user behavior and search patterns over time.
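As a non-limiting sketch of how a feedback loop might adjust the threshold, the following example nudges the threshold according to an observed null search rate. The update rule, the target rate, the step size, the clamping bounds, and the function name adjust_threshold are all assumptions introduced for illustration; the specification states only that the threshold is dynamically adjustable based on monitored outcomes.

```python
def adjust_threshold(threshold, null_rate, target_null_rate=0.05,
                     step=0.02, lower=0.1, upper=0.9):
    """Nudge the threshold toward a target null search rate (assumed rule).

    A null rate above target suggests queries are still over-specified, so the
    threshold is raised to drop more tokens; a rate below target lowers it so
    that more of the user's wording is preserved.
    """
    if null_rate > target_null_rate:
        threshold += step
    elif null_rate < target_null_rate:
        threshold -= step
    return min(upper, max(lower, threshold))

threshold = 0.5
for observed_null_rate in (0.12, 0.09, 0.04):  # made-up monitoring values
    threshold = adjust_threshold(threshold, observed_null_rate)

print(round(threshold, 2))
```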
[0042] In an embodiment, the system's architecture is specifically designed to handle complex, natural language queries that are becoming increasingly common in e-commerce search. Its deep processing layers and context-aware token importance assessment allow it to interpret nuanced query intent, even in lengthy or ambiguous searches. Furthermore, the system's continuous learning capabilities enable it to adapt to evolving search patterns and emerging e-commerce trends. As new product categories, brands, or search behaviors emerge, the system can dynamically adjust its token importance assessments and query refinement strategies to maintain high-quality search results across diverse and evolving e-commerce scenarios.
[0043] In an embodiment, the present invention also relates to a method for refining e-commerce search queries using the system of the present invention. The method comprises the following steps:
(a) receiving a search query provided by a user into a search field/input field of an e-commerce platform's user interface;
(b) generating initial word embeddings for one or more tokens in the search query using an embedding layer (114) that converts each word or sub-word token into a dense vector representation;
(c) obtaining entity embeddings for one or more tokens using a Named Entity Recognition (NER) model (112) that identifies and classifies named entities;
(d) combining the word embeddings and entity embeddings to create comprehensive token representations, fusing term embeddings with entity embeddings for a richer token representation;
(e) processing the comprehensive token representations obtained in step (d) through multiple layers (116) of a Token Importance Model (104), specifically a series of transformer layers (Layer 1, Layer 2, Layer 3) (116) that progressively refine token representations, capturing complex relationships and contextual information within the query;
(f) assigning importance scores to each token based on the processed representations using a classification layer (118), which evaluates the relevance of each token to the user's search intent;
(g) comparing the importance scores to a predefined threshold using a threshold evaluation component (120);
(h) removing tokens with importance scores below the predefined threshold, effectively eliminating less relevant terms from the original query, and retaining tokens with importance scores above the predefined threshold, preserving the most relevant terms for the search;
(i) generating a refined search query using the retained tokens through a query refinement module (122), which constructs an optimized query based on the important tokens;
(j) mapping the retained tokens to catalog key-value pairs for query understanding, associating query terms with specific attributes in the product catalog;
(k) retrieving products based on the refined search query and mapped catalog key-value pairs using a product retrieval interface (124), which interfaces with the e-commerce platform's product database (126) to fetch and rank relevant products; and
(l) outputting the retrieved and ranked products to the user, presenting the most relevant results based on the refined and optimized search query.
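By way of a non-limiting illustration, steps (g) to (j) above may be sketched as follows. The sketch assumes that the importance scores and NER tags have already been produced by the classification layer (118) and the NER model (112). The names `ScoredToken`, `retain_tokens`, `build_refined_query`, and `map_to_catalog`, the tag labels, the example scores, and the threshold value of 0.5 are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass
class ScoredToken:
    text: str      # token as it appears in the query
    score: float   # importance score, as produced by the classification layer (118)
    entity: str    # NER tag from the NER model (112); "O" (no entity) is an assumed label


def retain_tokens(tokens, threshold):
    """Steps (g)-(h): keep tokens scoring above the threshold, in query order.

    The handling of a score exactly equal to the threshold is not specified
    in the method; this sketch removes such tokens.
    """
    return [t for t in tokens if t.score > threshold]


def build_refined_query(retained):
    """Step (i): join the retained tokens into the refined query string."""
    return " ".join(t.text for t in retained)


def map_to_catalog(retained):
    """Step (j): group retained tokens by entity tag into key-value pairs."""
    pairs = {}
    for t in retained:
        if t.entity != "O":
            pairs.setdefault(t.entity, []).append(t.text)
    return pairs


# Hypothetical scores and tags for the query "best red Nike running shoes".
scored = [
    ScoredToken("best", 0.12, "O"),
    ScoredToken("red", 0.81, "color"),
    ScoredToken("Nike", 0.97, "brand"),
    ScoredToken("running", 0.88, "category"),
    ScoredToken("shoes", 0.95, "category"),
]

retained = retain_tokens(scored, threshold=0.5)
print(build_refined_query(retained))  # red Nike running shoes
print(map_to_catalog(retained))
```

In this sketch the subjective term "best" falls below the threshold and is dropped, while the brand, colour, and category terms are retained and grouped by attribute for retrieval in step (k).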
[0044] In an exemplary embodiment, the search query includes one or more tokens, wherein the tokens are individual words, sub-words, or phrases that represent discrete units of meaning within the query. For example, in the query "red Nike running shoes", the tokens are "red", "Nike", "running", and "shoes".
[0045] In another exemplary embodiment, the named entities may include, but are not limited to, brand names, product categories, or specific model numbers.
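By way of a non-limiting illustration, steps (c) and (d) of the method may be sketched as follows for entity types such as those listed above. The specification does not state how the fusion mechanism (110) combines the two embeddings; concatenation is assumed here purely for illustration, and element-wise addition or a learned projection would be equally plausible. The lookup tables `WORD_EMB` and `ENTITY_EMB`, their values, and the function `fuse` are hypothetical.

```python
# Hypothetical toy lookup tables. A real system would use the pre-trained
# word embeddings (106) and learned entity embeddings instead.
WORD_EMB = {
    "red": [0.1, 0.3],
    "nike": [0.7, 0.2],
    "shoes": [0.4, 0.9],
}
ENTITY_EMB = {
    "O": [0.0, 0.0],         # assumed label for "not a named entity"
    "color": [1.0, 0.0],
    "brand": [0.0, 1.0],
    "category": [1.0, 1.0],
}
UNKNOWN_WORD = [0.0, 0.0]    # fallback for out-of-vocabulary tokens


def fuse(token, tag):
    """Combine a token's word embedding with its entity embedding.

    Fusion is assumed to be concatenation, so the result has
    len(word vector) + len(entity vector) dimensions.
    """
    word_vec = WORD_EMB.get(token.lower(), UNKNOWN_WORD)
    entity_vec = ENTITY_EMB.get(tag, ENTITY_EMB["O"])
    return word_vec + entity_vec  # list concatenation, not addition


query = [("red", "color"), ("Nike", "brand"), ("shoes", "category")]
fused = [fuse(token, tag) for token, tag in query]
print(fused[1])  # [0.7, 0.2, 0.0, 1.0]
```

Under this assumption, the fused vector keeps the term and entity information in separate dimensions, and the subsequent processing layers (116) operate on the combined representation.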
[0046] In some implementations, the method may include additional steps such as:
• analyzing the search query for subjective terms or descriptive language;
• identifying brand names, categories, and other relevant attributes within the query;
• adjusting importance scores based on contextual information; and
• applying variable thresholds for different types of tokens or query contexts.
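By way of a non-limiting illustration, the variable thresholds mentioned above may be sketched as follows. The sketch assumes one threshold per entity tag plus a more lenient threshold for very short queries. The tag labels, all numeric values, and the names `THRESHOLD_BY_TAG`, `threshold_for`, and `select_tokens` are hypothetical and are not taken from the specification.

```python
DEFAULT_THRESHOLD = 0.5

# Assumed per-tag thresholds: entity tokens such as brands are harder to
# remove than untagged ("O") tokens, which are often subjective terms.
THRESHOLD_BY_TAG = {"brand": 0.2, "category": 0.3, "O": 0.6}

SHORT_QUERY_LENGTH = 2       # assumed cut-off for a "short" query
SHORT_QUERY_THRESHOLD = 0.1  # assumed lenient threshold for short queries


def threshold_for(tag, query_length):
    """Return the threshold for one token given its tag and the query length."""
    base = THRESHOLD_BY_TAG.get(tag, DEFAULT_THRESHOLD)
    if query_length <= SHORT_QUERY_LENGTH:
        # Short queries carry little redundancy, so removal is made harder.
        base = min(base, SHORT_QUERY_THRESHOLD)
    return base


def select_tokens(tokens):
    """Keep tokens whose score exceeds their tag- and context-specific threshold.

    `tokens` is a list of (text, score, tag) tuples in query order.
    """
    n = len(tokens)
    return [text for text, score, tag in tokens if score > threshold_for(tag, n)]


long_query = [("cheap", 0.55, "O"), ("nike", 0.25, "brand"), ("shoes", 0.9, "category")]
short_query = [("cheap", 0.15, "O"), ("shoes", 0.9, "category")]

print(select_tokens(long_query))   # ['nike', 'shoes']
print(select_tokens(short_query))  # ['cheap', 'shoes']
```

In the first query the brand token is retained despite its low score because its threshold is lower, while the untagged term "cheap" is removed; in the two-token query the same term is retained because the query context lowers the threshold.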
[0047] The method for refining e-commerce search queries may be integrated with existing e-commerce platforms to enhance search functionality and product retrieval accuracy.
[0048] The system offers several advantages over conventional search methods:
1. Improved search relevance: By identifying and removing redundant or non-essential terms from user queries, the system enhances the accuracy of product retrieval. This results in more relevant search results for users, increasing their satisfaction and likelihood of finding desired products.
2. Reduced null search results: The refined queries generated by the system are more likely to match available products in the catalog, thereby decreasing the occurrence of null search results. This improvement helps users find products even when their initial queries may be overly specific or contain unnecessary terms.
3. Enhanced query processing efficiency: By eliminating redundant terms, the system reduces the computational load on the search engine. This optimization can lead to faster query processing and improved overall system performance.
4. Context-aware query refinement: The system's ability to consider the context of terms within a query allows for more intelligent refinement. For example, distinguishing between subjective adjectives and brand names ensures that important terms are not inadvertently removed.
5. Adaptability to user behavior: Training the model on query chain data from search logs enables the system to learn from and adapt to real-world user search patterns. This adaptability allows the system to continually improve its performance over time.
6. Improved user experience: By presenting more relevant search results and reducing null searches, the system enhances the overall user experience on the e-commerce platform. This improvement can lead to increased user engagement, higher conversion rates, and improved customer satisfaction.
7. Scalability: The system's architecture, based on transformer models and entity recognition, allows for scalability across various types of e-commerce queries and product categories.
8. Reduced manual intervention: The automated nature of the query refinement process minimizes the need for manual curation of search terms or maintenance of keyword lists, saving time and resources for the e-commerce platform.
9. Support for long-tail queries: The system's ability to handle complex and specific queries makes it particularly effective for processing long-tail searches, which can be challenging for traditional search systems.
10. Integration with existing systems: The token importance model can be integrated into existing e-commerce search infrastructures, complementing and enhancing current search capabilities without requiring a complete overhaul of the system.
Claims:
WE CLAIM:
1. A context-aware token importance model-based system for e-commerce query refinement, comprising:
• a software module (102) installed in an operating device (128), comprising:
o a token importance model (104) configured to evaluate the significance of individual tokens within a user search query, comprising:
pre-trained word embeddings (106) providing initial semantic representations for input tokens;
Named Entity Recognition (NER) entity tags (108) configured to enhance understanding of token types, enabling recognition and categorization of specific entities within a query;
a fusion mechanism (110) for combining term embeddings with entity embeddings to create a comprehensive token representation;
• a Named Entity Recognition (NER) Model (112) configured to identify and classify named entities in the query, generating entity embeddings for input tokens;
• an embedding layer (114) configured to process input tokens to generate initial word embeddings, converting words into dense vector representations;
• multiple processing layers (116) configured to progressively refine token representations, capturing complex relationships and contextual information within the query;
• a classification layer (118) configured to assign importance scores to each token based on the processed representations, enabling identification of the most relevant tokens to the user's search intent;
• a threshold evaluation component (120) configured to compare token importance scores against a predefined threshold to determine which tokens to retain or remove, effectively refining the original query;
• a query refinement module (122) configured to construct a refined query using the retained tokens, optimizing it for improved product retrieval; and
• a product retrieval interface (124) configured to interface with an e-commerce platform's product database (126), using the refined query to fetch and rank relevant products;
• an operating device (128) for hosting and executing the software module (102); and
• the e-commerce product database (126) directly connected to the product retrieval interface (124), serving as a primary source of product information for the system;
wherein,
(I) the system is configured to refine queries using a novel fusion of dynamic token importance scoring and named entity recognition, employing threshold-based token selection that adapts to query context;
(II) the system learns from user behavior through training on query chain data from search logs, enabling continuous improvement in query interpretation, refinement, and product retrieval performance; and
(III) the system is configured to precisely handle complex, natural language queries and to dynamically adapt to evolving search patterns across diverse e-commerce scenarios.
2. The system as claimed in claim 1, wherein the token importance model (104) is a transformer-based model that evaluates the significance of individual tokens within a user search query.
3. The system as claimed in claim 1, wherein the NER Model (112) generates entity embeddings for input tokens, including brand embeddings, category embeddings, or other embeddings.
4. The system as claimed in claim 1, wherein the operating device (128) is selected from a group comprising a server, computer, and laptop.
5. A method for refining e-commerce search queries using the system as claimed in claim 1, comprising steps of:
(a) receiving a search query provided by a user into a search field/input field of an e-commerce platform's user interface;
(b) generating initial word embeddings for one or more tokens in the search query using an embedding layer (114) that converts each word or sub-word token into a dense vector representation;
(c) obtaining entity embeddings for one or more tokens using a Named Entity Recognition (NER) model (112) that identifies and classifies named entities;
(d) combining the word embeddings and entity embeddings to create comprehensive token representations, fusing term embeddings with entity embeddings for a richer token representation;
(e) processing the comprehensive token representations obtained in step (d) through multiple layers of a Token Importance Model (104), specifically a series of transformer/processing layers (Layer 1, Layer 2, Layer 3) (116) that progressively refine token representations, capturing complex relationships and contextual information within the query;
(f) assigning importance scores to each token based on the processed representations using a classification layer (118), which evaluates the relevance of each token to the user's search intent;
(g) comparing the importance scores to a predefined threshold using a threshold evaluation component (120);
(h) removing tokens with importance scores below the predefined threshold, effectively eliminating less relevant terms from the original query, and retaining tokens with importance scores above the predefined threshold, preserving the most relevant terms for the search;
(i) generating a refined search query using the retained tokens through a query refinement module (122), which constructs an optimized query based on the important tokens;
(j) mapping the retained tokens to catalog key-value pairs for query understanding, associating query terms with specific attributes in the product catalog;
(k) retrieving products based on the refined search query and mapped catalog key-value pairs using a product retrieval interface (124), which interfaces with the e-commerce platform's product database (126) to fetch and rank relevant products; and
(l) outputting the retrieved and ranked products to the user, presenting the most relevant results based on the refined and optimized search query.
6. The method as claimed in claim 5, wherein the search query includes one or more tokens selected from a group comprising individual words, sub-words, or phrases that represent discrete units of meaning within the query.
7. The method as claimed in claim 5, wherein the named entities include brand names, product categories, or specific model numbers.
8. The method as claimed in claim 5, wherein the method further comprises steps of: analyzing the search query for subjective terms or descriptive language; adjusting importance scores based on contextual information; and applying variable thresholds for different types of tokens or query contexts.
| # | Name | Date |
|---|---|---|
| 1 | 202541061513-STATEMENT OF UNDERTAKING (FORM 3) [27-06-2025(online)].pdf | 2025-06-27 |
| 2 | 202541061513-REQUEST FOR EXAMINATION (FORM-18) [27-06-2025(online)].pdf | 2025-06-27 |
| 3 | 202541061513-REQUEST FOR EARLY PUBLICATION(FORM-9) [27-06-2025(online)].pdf | 2025-06-27 |
| 4 | 202541061513-PROOF OF RIGHT [27-06-2025(online)].pdf | 2025-06-27 |
| 5 | 202541061513-POWER OF AUTHORITY [27-06-2025(online)].pdf | 2025-06-27 |
| 6 | 202541061513-FORM-9 [27-06-2025(online)].pdf | 2025-06-27 |
| 7 | 202541061513-FORM 18 [27-06-2025(online)].pdf | 2025-06-27 |
| 8 | 202541061513-FORM 1 [27-06-2025(online)].pdf | 2025-06-27 |
| 9 | 202541061513-DRAWINGS [27-06-2025(online)].pdf | 2025-06-27 |
| 10 | 202541061513-DECLARATION OF INVENTORSHIP (FORM 5) [27-06-2025(online)].pdf | 2025-06-27 |
| 11 | 202541061513-COMPLETE SPECIFICATION [27-06-2025(online)].pdf | 2025-06-27 |