
Method And System For Artificial Intelligence (AI) Based Insight Extraction From Format Bound Financial Transaction Data

Abstract: A method and system for AI based insight extraction from format-bound financial transaction data includes transforming a structured dataset in ISO format into a transformed dataset having metadata interpretable by an LLM. The metadata includes descriptions of field names, expected values, entity relationships, and business rules. The transformed dataset is analyzed using machine learning models such as regression analysis, principal component analysis, predictive modelling, or anomaly detection. An intent and a context of a natural language query are determined using an NLP model. Based on the intent, context, and metadata, the LLM generates a database query, which is executed on the structured dataset to retrieve relevant data. Insights are generated by combining the retrieved data with machine learning results. FIG. 1


Patent Information

Application #
Filing Date
15 December 2023
Publication Number
25/2025
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Parent Application

Applicants

INTELLECT DESIGN ARENA LIMITED
Plot No. 3/G3 Siruseri, SIPCOT IT Park, Chennai - 600130, Tamil Nadu, India.

Inventors

1. PRASHANT LALCHANDANI
6th Floor, West Wing, Marisoft - III, Marigold Premises, Marigold complex, Kalyani Nagar, Pune – 411014, Maharashtra, India.
2. VIMLENDU MISHRA
6th Floor, West Wing, Marisoft - III, Marigold Premises, Marigold complex, Kalyani Nagar, Pune – 411014, Maharashtra, India.
3. TAPAN AGARWAL
Level 5, E14 5NS, 50 Bank Street, Canary Wharf, London E14 5NS United Kingdom.

Specification

METHOD AND SYSTEM FOR ARTIFICIAL INTELLIGENCE BASED INSIGHT EXTRACTION FROM FORMAT-BOUND FINANCIAL TRANSACTION DATA
Technical Field
[0001] The embodiments herein relate to financial data processing, and more particularly, to a method and system for artificial intelligence based insight extraction from format-bound financial transaction data.
Description of the Related Art
[0002] In the realm of digital payments, modern payment systems have become increasingly prevalent, driven by the growing adoption of standard regulatory requirements established by industry authorities. One such critical standard is ISO 20022, developed by the International Organization for Standardization (ISO), which has become the industry benchmark for payment transactions. Widely adopted in the banking sector, this standard facilitates seamless message exchange between financial institutions, such as banks and clearing agencies, for both corporate and retail payments.
[0003] The ISO 20022 standard encompasses comprehensive details of payment transactions, including information about creditors, debtors, transaction amounts, intermediary banks, purposes of transactions, and remittance-related data. These structured datasets are essential for ensuring consistency and accuracy in financial transactions across global payment systems.
[0004] However, the rapid growth of the global economy has led to a substantial increase in financial transactions, resulting in the expansion of datasets to millions of records. This poses significant challenges for conventional methods and systems, which are ill-equipped to efficiently analyse such extensive structured datasets.
[0005] Existing solutions have scalability issues with large datasets. The exponential growth in financial transaction data has created vast datasets, such as those maintained in ISO 20022 formats, that conventional systems struggle to process efficiently for meaningful insights. Further, existing technologies are not optimized to extract actionable insights from highly structured and schema-based data formats like ISO 20022. Furthermore, the existing solutions lack the capability of dynamically interpreting user queries and providing context-aware responses based on the structured data.
[0006] While Generative Artificial Intelligence (AI) technologies, such as Large Language Models (LLMs), have demonstrated success with unstructured data, their application to structured datasets remains nascent and underexplored. The field lacks established frameworks or industry-wide standards for leveraging Generative AI and machine-learning in analyzing structured datasets within domains like digital payments.
[0007] These limitations hinder the ability of financial institutions to fully utilize their structured datasets for operational efficiencies and decision-making.
[0008] Therefore, there exists a need for an efficient, secure, and reliable analysis of large structured financial datasets to respond to a query of the user.
SUMMARY
[0009] In view of the foregoing, an embodiment herein provides a method for artificial intelligence (AI) based insight extraction from format-bound financial transaction data. The method includes (i) transforming a structured dataset having an ISO format into a transformed dataset having metadata that are interpretable by a Large Language Model (LLM), wherein the metadata includes descriptions of field names, expected values, entity relationships, and business rules associated with the structured dataset, (ii) analyzing the transformed dataset by applying at least one machine learning model, selected from regression analysis, principal component analysis, predictive modelling, or anomaly detection, (iii) determining an intent and a context of a natural language query from a user using a natural language processing (NLP) model, (iv) generating a database query from the natural language query using the LLM based on the intent and the context and the metadata of the structured dataset, (v) retrieving a subset of the transformed dataset by executing the database query on the structured dataset, and (vi) generating a response to the natural language query by combining the retrieved subset with results from the applied machine learning model, the response being at least one of a textual response or a visual response.
[0010] The method is advantageous in that it improves the functioning of a computer by enabling dynamic, conversational interaction between users and the system through NLP and LLM technologies. Unlike traditional business intelligence (BI) and machine learning (ML) solutions that rely on static, menu-driven interfaces, the method utilizes natural language-based interactive conversations. The method provides real-time query resolution and insight generation without requiring prior technical expertise. The method enables interpretation of complex, structured datasets in ISO formats and transforms them into metadata that is interpretable by the LLM, which enables extraction of actionable insights.
[0011] Additionally, the method provides runtime generation of AI/ML models, which improves computational efficiency and adaptability by dynamically identifying and applying the most appropriate ML model based on the dataset and user query.
[0012] In some embodiments, the metadata is generated by (i) identifying relationships between entities within the structured dataset using foreign key associations, (ii) defining constraints and business rules for validation of transaction data, and (iii) generating examples of natural language queries mapped to SQL queries for training the large language model.
[0013] In some embodiments, the machine learning model includes a time-series forecasting model that is configured to predict future trends based on historical transaction data.
[0014] In some embodiments, the response to the natural language query is a visual response comprising at least one of a dynamically generated graph or chart visualizing patterns, and a heat map representing anomaly detection results.
[0015] In some embodiments, the method includes monitoring user interaction with the generated response to refine the natural language processing model using reinforcement learning based on user feedback.
[0016] In some embodiments, the transformed dataset further comprises error code values as part of the metadata for interpretation by the LLM, wherein the error code values indicate at least one of (a) a success of the financial transaction, or (b) a categorized failure of the financial transaction.
[0017] In another aspect, there is provided a system for artificial intelligence (AI) based insight extraction from format-bound financial transaction data, the system comprising a Large Language Model (LLM), a structured dataset having an ISO format, and an AI based insight extraction server. The AI based insight extraction server comprises a memory that stores a database and a set of instructions, and a processor that executes the set of instructions and is configured to (i) transform a structured dataset having an ISO format into a transformed dataset having metadata that are interpretable by a Large Language Model (LLM), wherein the metadata includes descriptions of field names, expected values, entity relationships, and business rules associated with the structured dataset, (ii) analyze the transformed dataset by applying at least one machine learning model, selected from regression analysis, principal component analysis, predictive modelling, or anomaly detection, (iii) determine an intent and a context of a natural language query from a user using a natural language processing (NLP) model, (iv) generate a database query from the natural language query using the LLM based on the intent and the context and the metadata of the structured dataset, (v) retrieve a subset of the transformed dataset by executing the database query on the structured dataset, and (vi) generate a response to the natural language query by combining the retrieved subset with results from the applied machine learning model, the response being at least one of a textual response or a visual response.
[0018] The system is advantageous in that it improves the functioning of a computer by enabling dynamic, conversational interaction between users and the system through NLP and LLM technologies. Unlike traditional business intelligence (BI) and machine learning (ML) solutions that rely on static, menu-driven interfaces, the system utilizes natural language-based interactive conversations. The system provides real-time query resolution and insight generation without requiring prior technical expertise. The system enables interpretation of complex, structured datasets in ISO formats and transforms them into metadata that is interpretable by the LLM, which enables extraction of actionable insights.
[0019] Additionally, the system provides runtime generation of AI/ML models, which improves computational efficiency and adaptability by dynamically identifying and applying the most appropriate ML model based on the dataset and user query.
[0020] In some embodiments, the metadata is generated by (i) identifying relationships between entities within the structured dataset using foreign key associations, (ii) defining constraints and business rules for validation of transaction data, and (iii) generating examples of natural language queries mapped to SQL queries for training the large language model.
[0021] In some embodiments, the machine learning model includes a time-series forecasting model that is configured to predict future trends based on historical transaction data.
[0022] In some embodiments, the response to the natural language query is a visual response comprising at least one of a dynamically generated graph or chart visualizing patterns, and a heat map representing anomaly detection results.
[0023] In some embodiments, the processor is further configured to monitor user interaction with the generated response to refine the natural language processing model using reinforcement learning based on user feedback.
[0024] In some embodiments, the transformed dataset further comprises error code values as part of the metadata for interpretation by the LLM, wherein the error code values indicate at least one of (a) a success of the financial transaction, or (b) a categorized failure of the financial transaction.
[0025] These and other aspects of the embodiments herein will be better appreciated and understood when considered in conjunction with the following description and the accompanying drawings. It should be understood, however, that the following descriptions, while indicating preferred embodiments and numerous specific details thereof, are given by way of illustration and not of limitation. Many changes and modifications may be made within the scope of the embodiments herein without departing from the spirit thereof, and the embodiments herein include all such modifications.
BRIEF DESCRIPTION OF THE DRAWINGS
[0026] The embodiments herein will be better understood from the following detailed description with reference to the drawings, in which:
[0027] FIG. 1 is a block diagram of a system for artificial intelligence based insight extraction from format-bound financial transaction data, according to some embodiments herein;
[0028] FIG. 2 is an exploded view of an AI based insight extraction server of FIG. 1, according to some embodiments herein;
[0029] FIG. 3 illustrates an exemplary method for providing a response to a natural language query, in accordance with another embodiment of the present disclosure;
[0030] FIG. 4 illustrates an exemplary system architecture for artificial intelligence based insight extraction from format-bound financial transaction data, according to some embodiments herein;
[0031] FIG. 5 illustrates a process flow for artificial intelligence based insight extraction from format-bound financial transaction data, according to some embodiments herein;
[0032] FIG. 6 is a flow diagram that illustrates a method for artificial intelligence based insight extraction from format-bound financial transaction data, according to some embodiments herein; and
[0033] FIG. 7 is a schematic diagram of a computer architecture in accordance with the embodiments herein.
DETAILED DESCRIPTION OF THE DRAWINGS
[0034] The embodiments herein and the various features and advantageous details thereof are explained more fully with reference to the non-limiting embodiments that are illustrated in the accompanying drawings and detailed in the following description. Descriptions of well-known components and processing techniques are omitted so as to not unnecessarily obscure the embodiments herein. The examples used herein are intended merely to facilitate an understanding of ways in which the embodiments herein may be practiced and to further enable those of skill in the art to practice the embodiments herein. Accordingly, the examples should not be construed as limiting the scope of the embodiments herein.
[0035] The term "large language model" refers to an advanced artificial intelligence (AI) system trained on vast amounts of text data to understand, generate, and manipulate human language. An example of a large language model is GPT-3.
[0036] The term "format-bound financial transaction data" refers to data related to financial transactions that adhere to a predefined, structured format, such as an ISO standard, ensuring consistency and compatibility for processing.
[0037] The term "transformed dataset" refers to a dataset that has been converted from its original structured format into a modified format containing metadata, making it interpretable by advanced models like a Large Language Model (LLM).
[0038] The term “metadata” refers to any data that provides supplementary context, descriptive information, or attributes about the format-bound financial transaction data. This includes, but is not limited to, annotations, tags, timestamps, source identifiers, processing parameters, historical data, user interaction data, security attributes, ISO-formatted data, and any other relevant information that supports the interpretation, management, or application of the primary data. This inclusive definition applies broadly across the financial technology sector, encompassing areas such as transaction processing, data analysis, compliance, and security. Metadata serves to enhance financial systems by improving data quality, ensuring regulatory compliance, enabling more accurate analysis, and delivering personalized and efficient financial services.
[0039] The term "database query" refers to a structured request generated to retrieve specific information from a database, often written in a query language such as SQL.
[0040] The term "intent and context of a natural language query" refers to the underlying purpose and situational meaning extracted from a natural language query provided by a user.
[0041] As mentioned, there is a need for an efficient, secure, and reliable analysis of large structured financial datasets to respond to a query of the user. Embodiments herein provide a method and system for artificial intelligence (AI) based insight extraction from format-bound financial transaction data. Referring now to the drawings, and more particularly to FIGS. 1 through 7, where similar reference characters denote corresponding features consistently throughout the figures, preferred embodiments are shown.
[0042] FIG. 1 is a block diagram of a system 100 for artificial intelligence based insight extraction from format-bound financial transaction data, according to some embodiments herein. The system 100 includes an input unit 102, a processing unit 104, a display unit 108, a data communication network 106 and an AI based insight extraction server 110. The input unit 102 is adapted to allow the user to provide a natural language query to the system 100 to receive the output or response of the natural language query. In an example, the user may use a keypad, touchpad, touch interface, or mouse to input one or more queries for which the user wants a response in the natural language. In yet another example, the user may use an audio input device such as a microphone and the audio signals obtained by the microphone are converted into text which is used as a natural language query.
[0043] According to an embodiment of the present disclosure, a query may be received from the user to know the payment trends of the last five years. In another example, the query may be received from the user for identification of patterns and trends in the payments of ABC company in the last 2 months.
[0044] The processing unit 104 is configured to analyze the natural language query received from the user. The processing unit 104 analyzes the query using at least one trained model. In an example, the trained model may include at least an Artificial Intelligence (AI) based model and/or a Machine Learning (ML) based Natural Language Processing (NLP) model. The Artificial Intelligence (AI) based model may include a generative pre-trained transformer (GPT) model or a bidirectional encoder representations from transformers (BERT) model. The natural language query is analyzed by the processing unit 104 to identify the intent and context of the query. In an exemplary embodiment, machine learning may include supervised learning algorithms or unsupervised learning algorithms.
[0045] The processing unit 104 is further configured to classify the input text, or text obtained from audio signal, received from the user. For example, the processing unit 104 may classify the received text to understand whether the user is performing a generic conversation with the system 100 or wants to retrieve any specific data from the database. For example, the processing unit 104 may be configured to determine whether the received text includes a generic command, or the received text includes a specific command e.g., natural language query, to retrieve any specific data from the database. Accordingly, the processing unit may classify the received text as one of generic conversation / command or natural language query.
[0046] After classifying the text as the natural language query, the processing unit 104 is further configured to convert the natural language query into a database-specific query to extract the response of the query from the financial database. In a non-limiting embodiment, the financial database corresponds to the ISO dataset stored at the AI based insight extraction server 110 such as a local server or cloud server. The ISO dataset may correspond to the dataset related to the financial domain.
[0047] In an embodiment, the financial data associated with the communication is uploaded on the AI based insight extraction server 110 on a real-time basis. The financial data is uploaded to the AI based insight extraction server 110 through a data communication network 106. In the present disclosure, the terms “server” and “storage unit” may be used interchangeably and provide the same meaning. The data communication network 106, as used herein, may include, but is not limited to, any telecommunication network for the transmission of data. Further, the data communication network 106 includes, but is not limited to, internet networks using protocols such as HTTP, FTP, and IEEE protocols, radio access wireless networks such as 2G, 3G, LTE, 4G, 5G, and the like, and non-internet data communication networks such as CDMA and TDMA. The data communication network 106 facilitates the transfer of financial data from one point to another point. Further, the data communication network 106 acts as a medium to transfer data from one end to another end.
[0048] The processing unit 104, connected with the AI based insight extraction server 110, is configured to first create metadata for the financial dataset, e.g., ISO formatted data, with an explanation of field names and expected values in the dataset.
[0049] According to an embodiment of the present disclosure, the metadata may include, for example, but is not limited to, descriptive field names given to fields having long names within the ISO format that are indecipherable by the LLM. By generating descriptive field names in a natural language format as metadata, the LLM may be able to better interpret the metadata and, thereby, the format-bound financial transaction data.
[0050] In an embodiment, creating metadata for the financial data e.g., ISO formatted data includes converting the data structure or datasets into field names that are interpretable by the LLM model, and incorporating additional details that describe the expected values in the database. For example, the metadata supplied to the LLM may also be error code values; for example, error code 0 may be interpreted as a transactional "success." Error codes 1, 2, and 3 may be associated with one form of transactional "failure" and error codes 4, 5, and 6 may be associated with another form of transactional "failure." The error codes supplied as metadata may provide more context for interpretation to the LLM.
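By way of a non-limiting sketch, the error-code metadata described above could be represented as a small lookup table that is also rendered as plain text for the LLM. The code-to-category mapping (0, 1 to 3, 4 to 6) follows the example in the preceding paragraph; the names `Outcome`, `ERROR_CODE_METADATA`, `interpret`, and `describe_error_codes`, and the labels "category A" and "category B", are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Outcome(Enum):
    """Outcome categories; the two failure labels are hypothetical."""
    SUCCESS = "success"
    FAILURE_A = "failure (category A)"
    FAILURE_B = "failure (category B)"


# Mapping from the example in the text: 0 is a success, 1-3 are one form of
# failure, and 4-6 are another form of failure.
ERROR_CODE_METADATA = {
    0: Outcome.SUCCESS,
    **{code: Outcome.FAILURE_A for code in (1, 2, 3)},
    **{code: Outcome.FAILURE_B for code in (4, 5, 6)},
}


def interpret(code):
    """Return the outcome for an error code, or None if the code is unknown."""
    return ERROR_CODE_METADATA.get(code)


def describe_error_codes(mapping):
    """Render the mapping as text lines that could be supplied to an LLM as metadata."""
    return "\n".join(
        f"error_code {code}: transaction {mapping[code].value}"
        for code in sorted(mapping)
    )


if __name__ == "__main__":
    print(describe_error_codes(ERROR_CODE_METADATA))
```

Keeping the mapping in one structure means the same definition can drive both programmatic validation and the text description handed to the model.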
[0051] In order to render the format-bound financial data understandable by the LLM, the financial transaction in the specific format (e.g., ISO 20022) may be flattened out, for example, into a single row of data. Definitions provided to the LLM may include, but are not limited to, the parties involved in a transaction, the transaction amount, the transaction settlement date, and a descriptive format. The LLM may use the data, the metadata, and/or the definitions input thereto to enable generation of derivable insights.
[0052] Still further, the metadata generation and subsequent operations are not limited to being triggered in response to a query from a user or a natural language query as discussed with reference to FIG. 3. Other triggers, such as changes in account balance, and automatic metadata generation and subsequent analyses are within the scope of the embodiments of the present disclosure.
[0053] In an example, the created metadata or required description may then be passed on to the Artificial Intelligence (AI) based model, e.g., a generative pre-trained transformer (GPT) or a bidirectional encoder representations from transformers (BERT) model, so that it can be used to generate an SQL query and also to create filters for analytics purposes. In another example, the natural language query may include “identify the patterns in the payment of ABC company in the last 2 months”. The indication “payment of ABC company in the last 2 months” may be used to retrieve data for ABC company for the last 2 months based on the SQL query generated using the trained model.
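As a minimal, non-limiting sketch of the flattening described above, a nested XML payment message may be reduced to one row keyed by descriptive column names. The sample message is a simplified, namespace-free fragment whose element names are written in the style of ISO 20022 credit-transfer messages; the sample values, the `FIELD_MAP` mapping, and the column names are assumptions made for illustration and are not taken from the specification.

```python
import xml.etree.ElementTree as ET

# Simplified sample message (assumption): real messages carry namespaces and
# many more elements than shown here.
SAMPLE_MESSAGE = """
<CdtTrfTxInf>
  <IntrBkSttlmAmt Ccy="EUR">1250.00</IntrBkSttlmAmt>
  <IntrBkSttlmDt>2023-12-15</IntrBkSttlmDt>
  <Dbtr><Nm>ABC Company</Nm></Dbtr>
  <Cdtr><Nm>XYZ Supplies</Nm></Cdtr>
</CdtTrfTxInf>
"""

# Hypothetical mapping from element paths to descriptive column names.
FIELD_MAP = {
    "Dbtr/Nm": "debtor_name",
    "Cdtr/Nm": "creditor_name",
    "IntrBkSttlmAmt": "settlement_amount",
    "IntrBkSttlmDt": "settlement_date",
}


def flatten(xml_text):
    """Flatten one nested transaction message into a single row (a dict)."""
    root = ET.fromstring(xml_text.strip())
    row = {}
    for path, column in FIELD_MAP.items():
        node = root.find(path)
        has_text = node is not None and node.text
        row[column] = node.text.strip() if has_text else None
    # The currency is carried as an attribute of the amount element.
    amount = root.find("IntrBkSttlmAmt")
    row["settlement_currency"] = amount.get("Ccy") if amount is not None else None
    return row


if __name__ == "__main__":
    print(flatten(SAMPLE_MESSAGE))
```

Missing elements are stored as `None` rather than raising an error, so that optional fields do not prevent a row from being produced.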
[0054] According to an embodiment of the present disclosure, the processing unit 104 may be configured to run or execute analytics (e.g., pre-defined or dynamically defined analytics) or queries based on the natural language query on the financial dataset, e.g., ISO formatted data. In an example, the step of running or executing analytics (e.g., pre-defined or dynamically defined analytics) or queries may be based on linear regression models, principal component analysis, or prediction and forecasting models applied to the financial dataset, e.g., ISO formatted data. In yet another example, the step of running or executing analytics (e.g., pre-defined or dynamically defined analytics) or queries may be based on a neural network model, e.g., a convolutional neural network (CNN) model.
[0055] According to another embodiment of the present disclosure, the processing unit 104 is configured to create or determine prompts, such as a database-specific query, that provide a context to the trained model, e.g., the LLM. The determination of prompts such as the database-specific query is based on the created metadata and analytics (e.g., pre-defined or dynamically defined analytics). Further, prompts such as the database-specific query provide the context to the trained model in addition to the user’s query so that the trained model is able to determine the desired results in an efficient manner.
[0056] According to another embodiment of the present disclosure, the natural language query may also include an indication relating to the intent of the user. For example, the natural language query may include “identify the patterns in the payment of ABC company in the last 2 months”. The indication “identify the patterns” relates to the intent of the user, while the indication “payment of ABC company in the last 2 months” may be used to retrieve data for ABC company for the last 2 months based on the SQL query generated using the trained model.
[0057] In accordance with an implementation, the processing unit 104 may use the indication in the natural language query to identify or determine filters and analytics. For example, such identification or determination of filters and analytics defining the intent of the user may be obtained based on the trained model. In another example, the processing unit 104 may be configured to generate computer program code, e.g., Python code, using the trained model to determine filters and analytics defining the intent of the user.
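As a non-limiting sketch of the retrieval step for the example query above, the filter portion of the query (company name and a two-month window) may be bound to a parameterised SQL statement and executed on a payments table. The statement `GENERATED_SQL` is hand-written here to stand in for what a trained model might produce; the table layout, the sample rows, the helper `months_back`, and the choice to start the window on the first day of the month are all assumptions made for illustration.

```python
import sqlite3
from datetime import date


def months_back(ref, months):
    """Return the first day of the month lying `months` months before `ref`."""
    index = ref.year * 12 + (ref.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


# In-memory table with made-up rows (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (debtor_name TEXT, settlement_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?)",
    [
        ("ABC Company", "2023-09-20", 400.0),
        ("ABC Company", "2023-11-03", 1250.0),
        ("ABC Company", "2023-12-01", 980.0),
        ("Other Ltd", "2023-11-15", 70.0),
    ],
)

# Stand-in for a model-generated query; parameters keep user text out of the SQL.
GENERATED_SQL = """
SELECT settlement_date, amount
FROM payments
WHERE debtor_name = :company AND settlement_date >= :start
ORDER BY settlement_date
"""


def retrieve(company, months, today):
    """Run the query for one company over the last `months` months."""
    start = months_back(today, months).isoformat()
    params = {"company": company, "start": start}
    return conn.execute(GENERATED_SQL, params).fetchall()


if __name__ == "__main__":
    for row in retrieve("ABC Company", 2, date(2023, 12, 15)):
        print(row)
```

The rows returned by `retrieve` would then be handed to whichever analytics the intent portion of the query ("identify the patterns") selects.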
[0058] According to another embodiment of the present disclosure, the processing unit 104 is configured to extract a response to the natural language query from the financial dataset based on the determined filters and analytics. For example, the processing unit 104 determines a response to the query based on the determined filters and analytics.
[0059] The system 100 also includes the display unit 108. The display unit 108 is connected to the processing unit 104. The display unit 108 is configured to display the response to the natural language query to the user. In an exemplary embodiment, the response is displayed to the user in a visual form such as text or images. In an example, the response may be displayed in the form of charts, graphs, tables, trends, and the like. In an example, the response to the query is displayed to the user in natural language. In another example, a response to the query related to payment trends in the last 2 months may be displayed to the user as a graph representing the months on the x-axis and the payment trends on the y-axis.
[0060] FIG. 2 is an exploded view of the AI based insight extraction server 110 of FIG. 1, according to some embodiments herein. The AI based insight extraction server 110 includes a transformed dataset generation module 202, a transformed dataset analysis module 204, an intent and context determination module 206, a database query generation module 208, a database query execution module 210, and a natural language query response generation module 212.
[0061] The transformed dataset generation module 202 transforms a structured dataset having an ISO format into a transformed dataset having metadata that are interpretable by a Large Language Model (LLM). The metadata includes descriptions of field names, expected values, entity relationships, and business rules associated with the structured dataset. For example, the metadata may include detailed schema definitions such as the description of entities, relationships, column-level details, and business rules. These schema definitions provide detailed context for the payments dataset, including annotations like table names, column types, constraints, and foreign key associations. Business rules, such as identifying high-value transactions or validating transaction thresholds, may also be embedded in the metadata.
[0062] The metadata may include:
(a) Description of Entities: detailed information about each table in the database; the purpose and context of each entity; and how entities relate to real-world concepts or business objects.
(b) Relationships: an explanation of how different entities are connected; the types of relationships (one-to-one, one-to-many, many-to-many); and the foreign key associations between tables.
(c) Column Descriptions: detailed information about each column in every table; data types, constraints, and allowed values; and the meaning and significance of each field in business terms.
(d) Business Rules: guidelines for data interpretation; validation rules and constraints; calculation methods for derived fields; and specific business logic applied to the data.
(e) Example Queries: natural language questions that users might ask; the corresponding SQL queries for these questions; and an explanation of how these queries relate to business needs.
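The metadata categories listed above can be sketched, in a non-limiting manner, as a small data structure that is rendered into text context for the LLM. The class names, the two sample tables, the high-value threshold of 100000, and the example question are hypothetical; only the categories themselves (entities, relationships, columns, business rules, example queries) come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    dtype: str
    description: str


@dataclass
class Table:
    name: str
    description: str
    columns: list[Column]
    # Maps a column of this table to the "table.column" it references.
    foreign_keys: dict[str, str] = field(default_factory=dict)


@dataclass
class SchemaMetadata:
    tables: list[Table]
    business_rules: list[str]
    example_queries: list[tuple[str, str]]  # (question, SQL) pairs

    def to_prompt(self):
        """Render the metadata as plain text that can be placed in an LLM prompt."""
        lines = []
        for table in self.tables:
            lines.append(f"Table {table.name}: {table.description}")
            for col in table.columns:
                lines.append(f"  {col.name} ({col.dtype}): {col.description}")
            for col_name, target in table.foreign_keys.items():
                lines.append(f"  {table.name}.{col_name} references {target}")
        lines.append("Business rules:")
        lines.extend(f"  - {rule}" for rule in self.business_rules)
        lines.append("Example queries:")
        for question, sql in self.example_queries:
            lines.append(f"  Q: {question}")
            lines.append(f"  SQL: {sql}")
        return "\n".join(lines)


# Hypothetical sample metadata for a two-table payments schema.
metadata = SchemaMetadata(
    tables=[
        Table(
            "parties",
            "Debtors and creditors that take part in payments.",
            [
                Column("id", "INTEGER", "Primary key of the party."),
                Column("name", "TEXT", "Legal name of the party."),
            ],
        ),
        Table(
            "payments",
            "One row per flattened payment transaction.",
            [
                Column("debtor_id", "INTEGER", "Party that pays."),
                Column("amount", "REAL", "Settled amount."),
                Column("error_code", "INTEGER", "0 means success."),
            ],
            foreign_keys={"debtor_id": "parties.id"},
        ),
    ],
    business_rules=["A payment with amount above 100000 is treated as high-value."],
    example_queries=[
        (
            "How many payments failed?",
            "SELECT COUNT(*) FROM payments WHERE error_code <> 0",
        )
    ],
)

if __name__ == "__main__":
    print(metadata.to_prompt())
```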
[0063] The transformed dataset analysis module 204 analyzes the transformed dataset by applying at least one machine learning model, selected from regression analysis, principal component analysis, predictive modelling, or anomaly detection. This step enables the application of pre-defined analytics such as descriptive analytics (e.g., transaction volume trends, average transaction values, customer segmentation) and advanced models, including ARIMA for time-series analysis, random forest for classification, or isolation forest for anomaly detection. The process involves cleaning and normalizing data, running predictive models, and deriving insights, such as forecasting transaction volumes or identifying fraudulent behaviour. The analysis includes running pre-defined analytics and machine learning (ML) models on the payments data model. Examples of the analytics process include:

a. Descriptive Analytics:
Analyzing transaction volumes across daily, weekly, and monthly trends.
Calculating the average transaction value for different periods.
Evaluating payment method distribution across various categories (e.g., credit cards, bank transfers, UPI).
Identifying transaction distribution across payment rails, such as NEFT, RTGS, or UPI.
Segmenting customers based on their payment behavior, allowing for targeted insights and personalized recommendations.
b. Time Series Analysis:
Detecting seasonal patterns in payment volumes, such as increased transactions during holidays.
Analyzing trends across different payment methods over time.
Forecasting future transaction volumes based on historical data.
c. Anomaly Detection:
Identifying anomalies in transactions to flag potentially fraudulent activity.
Measuring false positive and false negative rates in existing fraud detection systems to improve accuracy.
ML Models for Payments Data:
a. Clustering Models:
Using K-means or DBSCAN to segment customers based on payment behaviors.
Employing hierarchical clustering to identify groups of similar transactions.
b. Classification Models:
Leveraging Random Forest or XGBoost for fraud detection with high accuracy.
Utilizing Support Vector Machines (SVM) for categorizing transactions into predefined types.
Applying Naive Bayes models for scalable, quick classification of payment types.
c. Regression Models:
Running Linear Regression to predict transaction values based on historical trends.
Employing Poisson Regression for forecasting the number of transactions during a specific period.
d. Time Series Models:
Implementing ARIMA (AutoRegressive Integrated Moving Average) to forecast payment volumes.
Using Prophet for detecting seasonal trends and making predictions.
Applying LSTM (Long Short-Term Memory) neural networks for analyzing and forecasting complex time series patterns in payment data.
e. Anomaly Detection Models:
Deploying Isolation Forest to detect unusual or outlier transactions.
Utilizing One-Class SVM for identifying rare patterns in payment activities.
f. Association Rule Learning:
Applying the Apriori algorithm to discover relationships between various payment attributes.
g. Deep Learning Models:
Using Neural Networks for recognizing intricate patterns in transaction data.
Applying Autoencoders for dimensionality reduction and learning key features of the data.
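The descriptive analytics and anomaly detection listed above can be illustrated with a short Python sketch. The sample rows are hypothetical, and the outlier check uses a modified z-score based on the median absolute deviation as a simple stand-in; it is not Isolation Forest or any other model named above.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical sample rows: (date, payment rail, amount).
TRANSACTIONS = [
    ("2024-01-01", "UPI", 120.0),
    ("2024-01-01", "NEFT", 2500.0),
    ("2024-01-02", "UPI", 90.0),
    ("2024-01-02", "RTGS", 250000.0),
    ("2024-01-03", "UPI", 110.0),
    ("2024-01-03", "NEFT", 2300.0),
]


def daily_volumes(rows):
    """Count transactions per day."""
    counts = defaultdict(int)
    for date, _rail, _amount in rows:
        counts[date] += 1
    return dict(counts)


def average_value_by_rail(rows):
    """Average transaction amount per payment rail."""
    amounts = defaultdict(list)
    for _date, rail, amount in rows:
        amounts[rail].append(amount)
    return {rail: mean(values) for rail, values in amounts.items()}


def flag_outliers(rows, threshold=3.5):
    """Flag rows whose modified z-score (median and MAD based) exceeds threshold."""
    amounts = [amount for _date, _rail, amount in rows]
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        return []  # all amounts are (nearly) identical; nothing to flag
    return [row for row in rows if 0.6745 * abs(row[2] - med) / mad > threshold]


print(daily_volumes(TRANSACTIONS))
print(average_value_by_rail(TRANSACTIONS))
print(flag_outliers(TRANSACTIONS))
```

In a fuller implementation, a library model such as an isolation forest would typically replace `flag_outliers`, while the grouping functions would likely be expressed as data frame operations.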
[0064] The intent and context determination module 206 determines an intent and a context of a natural language query from a user using a natural language processing (NLP) model. An orchestrator layer may process the user input and identify key attributes, such as user intent (e.g., payment trends or anomaly detection). For example, the orchestrator layer may use metadata to match user queries, such as "Identify seasonal trends in transactions," with relevant data structures and relationships.
[0065] The database query generation module 208 generates a database query from the natural language query using the LLM, based on the intent, the context, and the metadata of the structured dataset. The module applies a runtime conversion process in which the LLM analyses the query against the metadata, identifies relevant tables, relationships, and business rules, and constructs a structured SQL query to retrieve data. For instance, metadata for a "Transactions" table may include columns like transaction_id, amount, and transaction_date, enabling the LLM to create SQL queries for requests such as "Retrieve transactions above $10,000 in the last month."
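The runtime conversion can be sketched in Python as follows. This is an illustration under stated assumptions: the prompt wording, the schema text, and the SQL string are hypothetical, the LLM call itself is not shown (the SQL an LLM might return is hard-coded), and a fixed cutoff date stands in for "the last month". An in-memory SQLite database is used only to make the example runnable.

```python
import sqlite3

# Hypothetical metadata text describing the table to the LLM.
METADATA_TEXT = (
    "Table transactions(transaction_id TEXT, amount REAL, transaction_date TEXT). "
    "Rule: amounts are held in a single currency."
)


def build_prompt(question: str, metadata_text: str) -> str:
    """Assemble a prompt that pairs the user question with the metadata."""
    return (
        "Using only the schema below, write one SQL SELECT statement.\n"
        f"Schema and rules: {metadata_text}\n"
        f"Question: {question}\n"
    )


# SQL of the kind an LLM might return for the question; hard-coded here.
GENERATED_SQL = (
    "SELECT transaction_id, amount, transaction_date FROM transactions "
    "WHERE amount > ? AND transaction_date >= ? ORDER BY transaction_date"
)


def run_query(conn, sql, params):
    """Execute the generated query and return all matching rows."""
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (transaction_id TEXT, amount REAL, transaction_date TEXT)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("T1", 15000.0, "2024-03-20"),
        ("T2", 500.0, "2024-03-21"),
        ("T3", 22000.0, "2024-01-05"),
    ],
)

prompt = build_prompt("Retrieve transactions above $10,000 in the last month.", METADATA_TEXT)
rows = run_query(conn, GENERATED_SQL, (10000, "2024-03-01"))
print(rows)
```

Passing the threshold and the date as bound parameters, rather than interpolating them into the SQL string, is a design choice of this sketch that limits the effect of malformed values in generated queries.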
[0066] The database query execution module 210 retrieves a subset of the transformed dataset by executing the database query on the structured dataset.
[0067] The natural language query response generation module 212 generates a response to the natural language query by combining the retrieved subset with results from the applied machine learning model. The response may include a textual explanation or a visual output, such as dynamically generated graphs, charts, or heat maps. For example, the module can display customer segmentation insights, fraud detection results, or anomaly patterns in transaction data.
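One way the retrieved subset and the model results might be combined into a textual response is sketched below in Python. The function name, the row layout, and the set of flagged identifiers are hypothetical; the sketch covers only the textual form of the response, not the visual outputs.

```python
def build_response(question, rows, anomaly_ids):
    """Combine retrieved rows with model output into a short textual answer.

    rows        -- list of (transaction_id, amount) tuples from the database query
    anomaly_ids -- set of transaction ids that a model flagged as unusual
    """
    total = sum(amount for _tid, amount in rows)
    flagged = [tid for tid, _amount in rows if tid in anomaly_ids]
    lines = [
        f"Question: {question}",
        f"Matched {len(rows)} transactions totalling {total:.2f}.",
    ]
    if flagged:
        lines.append("Flagged as unusual: " + ", ".join(flagged) + ".")
    else:
        lines.append("No transactions were flagged as unusual.")
    return "\n".join(lines)


retrieved = [("T1", 15000.0), ("T3", 22000.0)]  # hypothetical query result
model_flags = {"T3"}                             # hypothetical model output
print(build_response("Show high-value payments", retrieved, model_flags))
```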
[0068] In some embodiments, the metadata is generated by (i) identifying relationships between entities within the structured dataset using foreign key associations, (ii) defining constraints and business rules for validation of transaction data, and (iii) generating examples of natural language queries mapped to SQL queries for training the large language model.
[0069] In some embodiments, the machine learning model includes a time-series forecasting model that is configured to predict future trends based on historical transaction data.
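As a simple illustration of forecasting from historical transaction data, the Python sketch below applies simple exponential smoothing to a series of daily transaction counts. Simple exponential smoothing is used here only as a compact stand-in; it is not one of the models named in this disclosure (such as ARIMA, Prophet, or LSTM), and the sample counts and the smoothing factor are hypothetical.

```python
def exponential_smoothing_forecast(series, alpha=0.5):
    """Return a one-step-ahead forecast using simple exponential smoothing.

    alpha close to 1 weights recent observations more heavily; alpha close
    to 0 produces a smoother, slower-moving forecast.
    """
    if not series:
        raise ValueError("series must not be empty")
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


daily_counts = [100, 120, 110, 130]  # hypothetical historical daily volumes
print(exponential_smoothing_forecast(daily_counts))
```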
[0070] In some embodiments, the response to the natural language query is a visual response comprising at least one of a dynamically generated graph or chart visualizing patterns, and a heat map representing anomaly detection results.
[0071] In some embodiments, user interaction is monitored with the generated response to refine the natural language processing model using reinforcement learning based on user feedback.
[0072] In some embodiments, the transformed dataset further comprises error code values as part of the metadata for interpretation by the LLM, wherein the error code values indicate at least one of (a) a success of the financial transaction, or (b) a categorized failure of the financial transaction.
[0073] FIG. 3 illustrates an exemplary method for providing a response to a natural language query, in accordance with another embodiment of the present disclosure.
[0074] At step 302, the system receives natural language (NL) text as input from a user. The input may be processed by an orchestrator layer that identifies the intent and context of the query. The orchestrator layer coordinates various tasks within the system to ensure the seamless functioning of the overall process.
[0075] At step 304, the system performs a classification call to determine whether the input natural language query requires a database query or other types of responses. The classification step involves using trained models, such as GPT or similar large language models (LLMs), to interpret the user's query.
[0076] At step 306, the system decides if a database (DB) query is needed. If the query does not involve database retrieval, the system proceeds to step 314, where a chat-based response is generated directly by the LLM. This response can include textual or visual outputs, which are delivered to the user at step 318.
[0077] If a database query is required, the system moves to step 308, where a structured query (e.g., SQL) is generated based on the intent of the natural language query. The metadata created for the financial dataset and analytics, including pre-defined or dynamically defined analytics, are used to create prompts for the LLM to understand the intent of the user. For instance, a query like "Identify the patterns in the payments of ABC company in the last 2 months" will identify relevant filters and analytics, leveraging the metadata and analytics capabilities.
[0078] At step 310, the system retrieves data from the database (ISO DB) based on the generated SQL query. This ensures that the relevant subset of the structured dataset is extracted for further processing. At step 312, the retrieved data is loaded into a data frame for additional analysis. This step prepares the dataset for operations using analytics tools such as Python libraries.
[0079] At step 316, the system uses Pandas AI to perform advanced analytics. Within step 316, the system generates Python code for specific filters and executes the code on the data frame. The filters and analytics are dynamically determined based on the query and the metadata. Open-source Python libraries may be utilized to process the dataset, identify patterns, and perform other analytical tasks. Also within step 316, the system generates a natural language response to the query of the user. This response includes both textual and visual components, such as dynamically generated graphs or charts and heat maps representing analytical results.
[0080] At step 318, the response, including the visualized results, is displayed to the user. This completes the interactive cycle, allowing the user to view actionable insights or patterns derived from the processed financial dataset.
[0081] Steps 304, 308, 314 and sub-steps of step 316 may be performed by utilizing a Large Language Model.
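The branching of FIG. 3 can be sketched in Python as a small routing function. This is an illustration only: the keyword check is a stand-in for the LLM classification call of step 304, and the four callables passed to `handle_query` are hypothetical placeholders for the SQL generation, database retrieval, analytics, and chat components.

```python
# Keyword stems used by the stand-in classifier; an assumption of this sketch.
DB_KEYWORDS = ("transaction", "payment", "trend", "pattern", "anomal")


def needs_db_query(question: str) -> bool:
    """Keyword stand-in for the classification of steps 304 and 306."""
    text = question.lower()
    return any(keyword in text for keyword in DB_KEYWORDS)


def handle_query(question, generate_sql, run_sql, analyze, chat):
    """Route a question down the chat path or the database path."""
    if not needs_db_query(question):
        return {"path": "chat", "answer": chat(question)}            # step 314
    sql = generate_sql(question)                                      # step 308
    rows = run_sql(sql)                                               # steps 310 and 312
    return {"path": "database", "answer": analyze(question, rows)}   # step 316


# Stub components so the routing can be exercised without an LLM or database.
stubs = dict(
    generate_sql=lambda q: "SELECT amount FROM payments",
    run_sql=lambda sql: [(100.0,), (250.0,)],
    analyze=lambda q, rows: f"{len(rows)} payments analysed",
    chat=lambda q: "chat reply",
)

print(handle_query(
    "Identify the patterns in the payments of ABC company in the last 2 months", **stubs
))
print(handle_query("What can you do?", **stubs))
```

Either path ends with the answer being returned for display, which corresponds to step 318.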
[0082] In an embodiment, the system provides a conversational NLP-based agent to interact with the user and address financial-related queries, ensuring accurate and contextually relevant responses to enhance decision-making processes.
[0083] FIG. 4 illustrates an exemplary system architecture for artificial intelligence based insight extraction from format-bound financial transaction data, according to some embodiments herein.
[0084] The natural language query input receiver 402 receives a natural language query from the user that captures a question or command of the user in human-readable language to initiate processing within the system. The user query processing unit 404 analyzes the received query to identify its purpose and classifies the query as either a generic query or a specific query, determining the type of response or further processing required.
[0085] The data processing unit 408 accesses the database, represented by database 410, to retrieve the data relevant to the query. The metadata extraction unit 406 extracts metadata related to the query and identifies necessary contextual information, such as field names, relationships, or data constraints, to determine what data needs to be accessed to generate the response.
[0086] The metadata processing and analytics unit 412 refines the extracted metadata into a structured format. The metadata processing and analytics unit 412 applies analytics to the structured metadata. The query converter unit 414 converts the refined metadata into a database-specific query, such as SQL.
[0087] The response retrieval unit 416 retrieves the results of the database-specific query executed on the database. The response display unit 418 displays the final response to the user. The response display unit 418 may present the output in textual or visual formats, such as tables, graphs, or charts, providing a clear and actionable response to the query of the user.
[0088] FIG. 5 illustrates a process flow for artificial intelligence based insight extraction from format-bound financial transaction data, according to some embodiments herein. At step 502, the system receives a user query in natural language. This input initiates the query processing flow. At step 504, the system determines whether the received query is a specific query or not. If the query is generic (NO path), the system proceeds to step 520. If the query is specific (YES path), the system moves to step 506. At step 506, the system extracts relevant metadata associated with the specific query. This metadata provides the necessary context and structure for further processing. At step 508, the system converts the extracted metadata into a structured format to enable efficient analysis and processing. At step 510, the system derives insights from the structured metadata. These insights help define the necessary data operations or filters to address the user's query. At step 512, the system converts the processed metadata and insights into a database-specific query, such as SQL, for retrieving relevant data. At step 514, the system determines the response to the query by executing the database-specific query on the relevant dataset.
[0089] At step 516, the system retrieves the response from the database. At step 518, the system displays the retrieved response to the user. This response may include textual or visual elements such as charts or graphs.
[0090] For generic queries (NO path from step 504), at step 520, the system generates a chat response using predefined logic or AI models. The response directly addresses the user's generic query. At step 522, the system displays the response to the user.
[0091] FIG. 6 is a flow diagram that illustrates a method for artificial intelligence based insight extraction from format-bound financial transaction data, according to some embodiments herein. At step 602, the method comprises transforming a structured dataset having an ISO format into a transformed dataset having metadata that are interpretable by a Large Language Model (LLM), wherein the metadata includes descriptions of field names, expected values, entity relationships, and business rules associated with the structured dataset. At step 604, the method comprises analyzing the transformed dataset by applying at least one machine learning model, selected from regression analysis, principal component analysis, predictive modelling, or anomaly detection. At step 606, the method comprises determining an intent and a context of a natural language query from a user using a natural language processing (NLP) model. At step 608, the method comprises generating a database query from the natural language query using the LLM based on the intent and the context and the metadata of the structured dataset. At step 610, the method comprises retrieving a subset of the transformed dataset by executing the database query on the structured dataset. At step 612, the method comprises generating a response to the natural language query by combining the retrieved subset with results from the applied machine learning model, the response being at least one of a textual response or a visual response.
[0092] The method is advantageous in that it improves the functioning of a computer by enabling dynamic, conversational interaction between users and the system through NLP and LLM technologies. Unlike traditional business intelligence (BI) and machine learning (ML) solutions that rely on static, menu-driven interfaces, the method utilizes natural language-based interactive conversations. The method provides real-time query resolution and insight generation without requiring prior technical expertise. The method enables interpretation of complex, structured datasets in ISO formats and transforms them into metadata that is interpretable by the LLM, which enables extraction of actionable insights.
[0093] Additionally, the method provides runtime generation of AI/ML models, which improves computational efficiency and adaptability by dynamically identifying and applying the most appropriate ML model based on the dataset and the user query.
[0094] In some embodiments, the metadata is generated by (i) identifying relationships between entities within the structured dataset using foreign key associations, (ii) defining constraints and business rules for validation of transaction data, and (iii) generating examples of natural language queries mapped to SQL queries for training the large language model.
[0095] In some embodiments, the machine learning model includes a time-series forecasting model that is configured to predict future trends based on historical transaction data.
[0096] In some embodiments, the response to the natural language query is a visual response comprising at least one of a dynamically generated graph or chart visualizing patterns, and a heat map representing anomaly detection results.
[0097] In some embodiments, the method includes monitoring user interaction with the generated response to refine the natural language processing model using reinforcement learning based on user feedback.
[0098] In some embodiments, the transformed dataset further comprises error code values as part of the metadata for interpretation by the LLM, wherein the error code values indicate at least one of (a) a success of the financial transaction, or (b) a categorized failure of the financial transaction.
[0099] The embodiments herein may include a computer program product configured to include a pre-configured set of instructions, which when performed, can result in actions as stated in conjunction with the methods described above. In an example, the pre-configured set of instructions can be stored on a tangible non-transitory computer readable medium or a program storage device. In an example, the tangible non-transitory computer readable medium can be configured to include the set of instructions, which when performed by a device, can cause the device to perform acts similar to the ones described here. Embodiments herein may also include tangible and/or non-transitory computer-readable storage media for carrying or having computer executable instructions or data structures stored thereon.
[00100] Generally, program modules utilized herein include routines, programs, components, data structures, objects, and the functions inherent in the design of special-purpose processors, etc. that perform particular tasks or implement particular abstract data types. Computer executable instructions, associated data structures, and program modules represent examples of the program code means for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps.
[00101] The embodiments herein can include both hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc.
[00102] A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
[00103] Input/output (I/O) devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers. Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems, and Ethernet cards are just a few of the currently available types of network adapters.
[00104] A representative hardware environment for practicing the embodiments herein is depicted in FIG. 7, with reference to FIGS. 1 through 6. This schematic drawing illustrates a hardware configuration of a server/computer system/user device in accordance with the embodiments herein. The hardware includes at least one processing device 10. The special-purpose CPUs 10 are interconnected via system bus 12 to various devices such as a random-access memory (RAM) 14, read-only memory (ROM) 16, and an input/output (I/O) adapter 18. The I/O adapter 18 can connect to peripheral devices, such as disk units 11 and tape drives 13, or other program storage devices that are readable by the system. The hardware can read the inventive instructions on the program storage devices and follow these instructions to execute the methodology of the embodiments herein. The hardware further includes a user interface adapter 19 that connects a keyboard 15, mouse 17, speaker 24, microphone 22, and/or other user interface devices such as a touch screen device (not shown) to the bus 12 to gather user input. Additionally, a communication adapter 20 connects the bus 12 to a data processing network 25, and a display adapter 21 connects the bus 12 to a display device 23, which provides a graphical user interface (GUI) 29 of the output data in accordance with the embodiments herein, or which may be embodied as an output device such as a monitor, printer, or transmitter, for example. Further, a transceiver 26, a signal comparator 27, and a signal converter 28 may be connected with the bus 12 for processing, transmission, receipt, comparison, and conversion of electric or electronic signals.
[00105] The foregoing description of the specific embodiments will so fully reveal the general nature of the embodiments herein that others can, by applying current knowledge, readily modify and/or adapt for various applications such specific embodiments without departing from the generic concept, and, therefore, such adaptations and modifications should and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. It is to be understood that the phraseology or terminology employed herein is for the purpose of description and not of limitation. Therefore, while the embodiments herein have been described in terms of preferred embodiments, those skilled in the art will recognize that the embodiments herein can be practiced with modification within the scope.

Dated this 13th day of December 2024

Sachin Manocha
[IN/PA-3247]
Of KRIA Law
Agent for Applicant
CLAIMS
What is claimed is:

1. A method for artificial intelligence (AI) based insight extraction from format-bound financial transaction data, the method comprising:
transforming a structured dataset having an ISO format into a transformed dataset having metadata that are interpretable by a Large Language Model (LLM), wherein the metadata includes descriptions of field names, expected values, entity relationships, and business rules associated with the structured dataset;
analyzing the transformed dataset by applying at least one machine learning model, selected from regression analysis, principal component analysis, predictive modelling, or anomaly detection;
determining an intent and a context of a natural language query from a user using a natural language processing (NLP) model;
generating a database query from the natural language query using the LLM based on the intent and the context and the metadata of the structured dataset;
retrieving a subset of the transformed dataset by executing the database query on the structured dataset; and
generating a response to the natural language query by combining the retrieved subset with results from the applied machine learning model, the response being at least one of a textual response or a visual response.

2. The method of claim 1, wherein the metadata is generated by:
(i) identifying relationships between entities within the structured dataset using foreign key associations;
(ii) defining constraints and business rules for validation of transaction data; and
(iii) generating examples of natural language queries mapped to SQL queries for training the large language model.

3. The method of claim 1, wherein the machine learning model includes a time-series forecasting model that is configured to predict future trends based on historical transaction data.

4. The method of claim 1, wherein the response to the natural language query is a visual response comprising at least one of a dynamically generated graph or chart visualizing patterns, and a heat map representing anomaly detection results.

5. The method of claim 1, further comprising monitoring user interaction with the generated response to refine the natural language processing model using reinforcement learning based on user feedback.

6. The method of claim 1, wherein the transformed dataset further comprises error code values as part of the metadata for interpretation by the LLM, wherein the error code values indicate at least one of (a) a success of the financial transaction, or (b) a categorized failure of the financial transaction.

7. A system of artificial intelligence (AI) based insight extraction from format-bound financial transaction data, wherein the system comprises:
a Large Language Model (LLM);
a structured dataset having an ISO format; and
an AI based insight extraction server, comprising:
a memory that stores a database and a set of instructions; and
a processor that executes the set of instructions and is configured to:
transform a structured dataset having an ISO format into a transformed dataset having metadata that are interpretable by a Large Language Model (LLM), wherein the metadata includes descriptions of field names, expected values, entity relationships, and business rules associated with the structured dataset;
analyze the transformed dataset by applying at least one machine learning model, selected from regression analysis, principal component analysis, predictive modelling, or anomaly detection;
determine an intent and a context of a natural language query from a user using a natural language processing (NLP) model;
generate a database query from the natural language query using the LLM based on the intent and the context and the metadata of the structured dataset;
retrieve a subset of the transformed dataset by executing the database query on the structured dataset; and
generate a response to the natural language query by combining the retrieved subset with results from the applied machine learning model, the response being at least one of a textual response or a visual response.

8. The system of claim 7, wherein the metadata is generated by:
(i) identifying relationships between entities within the structured dataset using foreign key associations;
(ii) defining constraints and business rules for validation of transaction data; and
(iii) generating examples of natural language queries mapped to SQL queries for training the large language model.


9. The system of claim 7, wherein the machine learning model includes a time-series forecasting model that is configured to predict future trends based on historical transaction data.

10. The system of claim 7, wherein the response to the natural language query is a visual response comprising at least one of a dynamically generated graph or chart visualizing patterns, and a heat map representing anomaly detection results.

11. The system of claim 7, wherein the processor is further configured to monitor user interaction with the generated response to refine the natural language processing model using reinforcement learning based on user feedback.

12. The system of claim 7, wherein the transformed dataset further comprises error code values as part of the metadata for interpretation by the LLM, wherein the error code values indicate at least one of (a) a success of the financial transaction, or (b) a categorized failure of the financial transaction.


Documents

Application Documents

# Name Date
1 202341062349-STATEMENT OF UNDERTAKING (FORM 3) [15-09-2023(online)].pdf 2023-09-15
2 202341062349-PROVISIONAL SPECIFICATION [15-09-2023(online)].pdf 2023-09-15
3 202341062349-POWER OF AUTHORITY [15-09-2023(online)].pdf 2023-09-15
4 202341062349-FORM 1 [15-09-2023(online)].pdf 2023-09-15
5 202341062349-DRAWINGS [15-09-2023(online)].pdf 2023-09-15
6 202341062349-Proof of Right [11-10-2023(online)].pdf 2023-10-11
7 202341062349-POA [10-09-2024(online)].pdf 2024-09-10
8 202341062349-FORM 13 [10-09-2024(online)].pdf 2024-09-10
9 202341062349-APPLICATIONFORPOSTDATING [10-09-2024(online)].pdf 2024-09-10
10 202341062349-AMENDED DOCUMENTS [10-09-2024(online)].pdf 2024-09-10
11 202341062349-DRAWING [13-12-2024(online)].pdf 2024-12-13
12 202341062349-CORRESPONDENCE-OTHERS [13-12-2024(online)].pdf 2024-12-13
13 202341062349-COMPLETE SPECIFICATION [13-12-2024(online)].pdf 2024-12-13
14 202341062349-Response to office action [07-01-2025(online)].pdf 2025-01-07
15 202341062349-Annexure [07-01-2025(online)].pdf 2025-01-07
16 202341062349-Request Letter-Correspondence [08-01-2025(online)].pdf 2025-01-08
17 202341062349-Request Letter-Correspondence [08-01-2025(online)]-1.pdf 2025-01-08
18 202341062349-Power of Attorney [08-01-2025(online)].pdf 2025-01-08
19 202341062349-Form 1 (Submitted on date of filing) [08-01-2025(online)].pdf 2025-01-08
20 202341062349-Form 1 (Submitted on date of filing) [08-01-2025(online)]-1.pdf 2025-01-08
21 202341062349-Covering Letter [08-01-2025(online)].pdf 2025-01-08
22 202341062349-Covering Letter [08-01-2025(online)]-1.pdf 2025-01-08
23 202341062349-CERTIFIED COPIES TRANSMISSION TO IB [08-01-2025(online)].pdf 2025-01-08
24 202341062349-CERTIFIED COPIES TRANSMISSION TO IB [08-01-2025(online)]-1.pdf 2025-01-08