Abstract: INTELLIGENT ANALYSIS OF DOCUMENT DRIVEN QA CHATBOT. The Intelligent Analysis of Document Driven QA Chatbot takes input from the user. The UI accepts simple text from a text box (100), text documents (101), and images (102). Data collection is followed by data preprocessing (103). Text from the text box and from text documents requires no external preprocessing, since the simple vector model handles it. An image given as input contains text data (104). The image is preprocessed using the computer vision module cv2: it is converted to grayscale, noise is removed, contours are added, and the font is dilated. The text is then extracted from the image using the pytesseract module. From this point on, the process is the same for all three input types. The extracted text is provided as input to a module called GPTSimpleVectorIndex (105), which performs vector embedding on the raw text. The text is converted into an index file containing key-value pairs (106). The bot is built on this index file so that when a user asks a question (107), it selects a relevant value from among the key-value pairs and returns the answer (108).
Description: FIELD OF INVENTION
The aim of this project is to provide a chatbot that answers questions about an uploaded document and clarifies the user's queries about it. The project has a simple user interface that helps users get to know their documents efficiently. For example, consider a student who visits a library to look up a detail in a book. He finds the book, but it runs to more than 500 pages. Given an e-book of the same title, he can upload it to this app, and the bot builds its model on the uploaded book. He can then ask anything about the book, such as a definition he needs clarified. The bot replies within a fraction of a second. This Intelligent Document Analysis can be applied to any kind of document in various domains. As another example, many documents are generated for each case before a court, and the case file grows after each hearing, which makes it difficult for the jury to reach a conclusion. With our application, they can easily get guidance and clarification on any part of the files and documents. These are just sample use cases.
Intelligent Document Analysis is the process of extracting, processing, and embedding pieces of information from a document into an indexed file, from which a chatbot modeled on that file retrieves answers to user queries. It is generic to every domain that uses documents. It is operationally efficient at solving various business problems and helps individuals understand their documents.
BACKGROUND OF INVENTION
In this section, we provide a background study on intelligent analysis of document-driven QA chatbots, discussing the key aspects of natural language understanding, document retrieval, question answering, and conversational AI.
Natural Language Understanding (NLU):
NLU is a fundamental aspect of intelligent document-driven QA chatbots, as it enables the systems to understand and process user inputs. Manning et al. (2008) provide a comprehensive overview of NLU techniques, which involve tokenization, part-of-speech tagging, and parsing, among other tasks [1]. Young et al. (2018) discuss the impact of deep learning on NLU, which has significantly improved the performance of various NLU tasks [2].
Document Retrieval:
Retrieving relevant documents is essential for document-driven QA chatbots. Baeza-Yates and Ribeiro-Neto (2011) provide a detailed introduction to information retrieval, including the indexing, ranking, and evaluation of documents. Manning et al. (2009) present an extensive study on information retrieval techniques, including probabilistic models, language models, and vector space models.
Question Answering:
QA systems have evolved from rule-based to machine learning-based approaches, with deep learning methods achieving significant success. Jurafsky and Martin (2021) discuss various QA techniques, including pattern matching, information extraction, and machine learning-based approaches. Rajpurkar et al. (2016) introduced the Stanford Question Answering Dataset (SQuAD), a widely used benchmark for evaluating the performance of QA systems.
Conversational AI and Chatbots:
Chatbots have become popular tools for customer support, e-commerce, and various other applications. Shawar and Atwell (2007) provide an early overview of chatbot technologies and their applications. Gao et al. (2019) present a comprehensive survey on neural approaches to conversational AI, including sequence-to-sequence models, hierarchical models, and reinforcement learning-based methods.
By analyzing the literature on NLU, document retrieval, question answering, and conversational AI, we can better understand the various components and challenges involved in developing intelligent document-driven QA chatbots.
PROBLEM STATEMENT
The volume of documents generated around the world to date is humongous. Searching for a single item in a single document is formidable. Moreover, traditional systems for locating specific information may require formulating queries. A layman cannot write queries or program code to extract the information he needs from a text file. This hampers work efficiency for everyone from individuals to multinationals. The objective of this project is to reduce the latency between searching for and retrieving specific information for individual or business needs. The person conversing with the bot need not be an expert.
SUMMARY OF THE INVENTION
OCR
OCR stands for Optical Character Recognition. It is an application of Computer Vision used to extract text from an image. The library we use to extract the content is pytesseract. The Tesseract module is designed to compute dark regions by concentrating on them from a volume of data. When the Intelligent Document Analysis takes an image as input, this methodology of processing and extracting information is followed. Each page of a PDF document is treated as an image and processed through the same pipeline as an image.
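The image path can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the 3-pixel median blur, the Otsu threshold, and the 2x2 dilation kernel are all assumptions chosen for the example, and the contour step is omitted. It needs the third-party packages opencv-python and pytesseract (plus the Tesseract binary), which are imported lazily so the pure-text helper can be used without them.

```python
import re


def preprocess_for_ocr(image_path):
    """Grayscale, denoise, binarize, and thicken the glyphs of an image."""
    import cv2  # third-party: opencv-python, imported lazily

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"could not read image: {image_path}")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    denoised = cv2.medianBlur(gray, 3)  # assumed kernel size
    # Inverted Otsu threshold: text becomes white on a black background,
    # so that dilation grows the text strokes rather than the page.
    _, binary_inv = cv2.threshold(
        denoised, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU
    )
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
    thick = cv2.dilate(binary_inv, kernel, iterations=1)
    # Flip back to dark text on a light page before OCR.
    return cv2.bitwise_not(thick)


def clean_ocr_text(raw):
    """Collapse runs of spaces/tabs and drop blank lines from OCR output."""
    lines = (re.sub(r"[ \t]+", " ", line).strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


def image_to_text(image_path):
    """Run the preprocessing above, then extract text with Tesseract."""
    import pytesseract  # third-party; requires the Tesseract binary

    return clean_ocr_text(pytesseract.image_to_string(preprocess_for_ocr(image_path)))
```

The threshold is inverted before dilation because OpenCV's dilation grows bright regions; dilating dark text on a white page directly would thin the strokes instead of thickening them.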
NLP
Natural Language Processing is a branch of Artificial Intelligence concerned with giving the computer the knowledge to understand text, in this case documents. It translates the text from the document into a machine-processable form for analysis. NLP can determine the questioner's intention and provide a matching response. The models and tools used are listed below:
1. LLM
2. Langchain
3. GPT
The Large Language Model (LLM) is a Machine Learning model used to solve many problems in NLP. LLMs apply to a wide range of use cases, of which we use three: generate, search, and extract.
Langchain is a framework used to develop applications powered by LLMs. The langchain module aids in indexing the corpus and models the question-answer bot on the data provided. It constructs answers to the questions asked.
The GPT index acts as an interface connecting the LLM with the data. It creates an index.json file with vector embeddings based on the uploaded document. It works through in-context learning, in which content from our document is fed to the model.
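To make the idea of an index file with vector embeddings concrete, here is a small standard-library sketch. It is not the GPT index API: the word-count "embedding", the chunk size, the `chunk-N` keys, and all function names are assumptions for illustration only. A real system would use a learned embedding model in place of `embed`.

```python
import json
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a sparse word-count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(text, chunk_size=40):
    """Split text into word chunks and store each under a key with its vector."""
    words = text.split()
    index = {}
    for n, start in enumerate(range(0, len(words), chunk_size)):
        chunk = " ".join(words[start:start + chunk_size])
        index[f"chunk-{n}"] = {"text": chunk, "vector": dict(embed(chunk))}
    return index


def query(index, question):
    """Return the text of the chunk most similar to the question."""
    q = embed(question)
    best = max(index.values(), key=lambda e: cosine(q, Counter(e["vector"])))
    return best["text"]
```

Because the index is a plain dictionary of strings and numbers, `json.dumps(build_index(text))` yields content that could be written to a file such as index.json and reloaded with `json.loads` before querying. In the described system the retrieved chunk would then be placed in the LLM prompt as context rather than returned verbatim.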
GRADIO
Gradio allows you to create web-based UI applications. The UI in this project is designed to accept user input in the form of simple text, text files, and images. Gradio is used here for a quick proof of concept; other methods can be used for deployment.
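A minimal Gradio wiring for such a UI might look like the sketch below. The handler name `verify_content`, the widget labels, and the fallback message are hypothetical; only plain-text uploads are read here, whereas image uploads would go through the OCR path instead. Gradio is a third-party package and is imported lazily inside `build_demo`.

```python
def verify_content(text, uploaded_file):
    """Return the content to show the user before the bot is built."""
    if text and text.strip():
        return text.strip()
    if uploaded_file is None:
        return "No input provided."
    # Depending on the Gradio version, the file arrives as a path string
    # or as an object exposing the path via `.name`.
    path = uploaded_file if isinstance(uploaded_file, str) else uploaded_file.name
    with open(path, encoding="utf-8", errors="replace") as handle:
        return handle.read()


def build_demo():
    """Create the web UI: a text box and a file upload feeding one handler."""
    import gradio as gr  # third-party, imported lazily

    return gr.Interface(
        fn=verify_content,
        inputs=[gr.Textbox(label="Normal text input"), gr.File(label="Upload document")],
        outputs=gr.Textbox(label="Verify the content"),
        title="Knowledge Bot",
    )
```

Calling `build_demo().launch()` starts a local web server for the interface.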
PROMPTING TECHNIQUE
For a precise response to user queries, one should follow a prompting technique. Two important rules of Prompt Engineering are:
A. Write clear and specific instructions:
a. Include delimiters in the right place.
b. Ask for structured output.
c. Check whether conditions are satisfied.
d. Ask the chatbot to perform a task.
B. Let the model think:
a. Specify the steps to complete a task.
b. Instruct the model to work out its own solution before rushing to a conclusion.
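The rules above can be illustrated with a small prompt builder. The delimiter string, the step wording, and the JSON keys are assumptions made for this sketch, not the prompts used in the project.

```python
def build_prompt(context, question):
    """Assemble a prompt that applies the two prompting rules."""
    delimiter = "####"  # rule A.a: delimit the document excerpt
    steps = [
        "Read the document excerpt between the delimiters.",
        # rule A.c: check a condition before answering
        "Check whether the excerpt contains the answer; if not, answer 'Not found'.",
        # rule B.b: reason before concluding
        "Work out the answer step by step before stating a conclusion.",
        # rule A.b: ask for structured output
        'Reply as JSON with the keys "answer" and "evidence".',
    ]
    # rule B.a: spell out the steps as a numbered list
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return (
        "Answer the question using only the document excerpt.\n"
        f"Follow these steps:\n{numbered}\n\n"
        f"{delimiter}\n{context.strip()}\n{delimiter}\n\n"
        f"Question: {question.strip()}"
    )
```

Keeping the excerpt between delimiters separates the document text from the instructions, which makes it harder for content inside the document to be mistaken for a command.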
OBJECTS OF THE INVENTION
The objects of the invention on "Intelligent Analysis of Document-Driven QA Chatbot" aim to develop an advanced document-driven question-answering chatbot that effectively retrieves, understands, and responds to user queries in a conversational manner. The primary objectives of the invention include:
1. Enhance document retrieval: Improve the document retrieval process by incorporating advanced natural language understanding and information retrieval techniques to identify and rank the most relevant documents for a given user query.
2. Improve question-answering capabilities: Leverage state-of-the-art deep learning models and architectures to accurately extract and generate context-aware answers from the retrieved documents, ensuring the chatbot provides precise and informative responses.
3. Contextualize user interactions: Develop a conversational AI system that understands and maintains context across multiple turns of dialogue, allowing the chatbot to engage in more coherent and natural conversations with users.
4. Adapt to various domains: Design a chatbot that can be easily customized and adapted to different domains, industries, and applications, ensuring its effectiveness and applicability in a wide range of contexts.
5. Optimize user experience: Create a user-friendly interface and seamless interaction experience by incorporating natural language generation techniques that produce fluent, grammatically correct, and coherent responses.
6. Ensure scalability and efficiency: Design a chatbot system that can efficiently handle large volumes of documents and user queries without compromising the quality of the answers and response times.
7. Evaluate performance: Develop robust evaluation metrics and benchmarks to assess the chatbot's performance in terms of document retrieval, question answering, and overall user satisfaction, ensuring continuous improvement and refinement of the system.
8. Promote explainability and transparency: Implement mechanisms that allow the chatbot to provide explanations for its answers, enhancing user trust and understanding of the system's decision-making processes.
BRIEF DESCRIPTION OF FIGURES
These and other features, aspects, and advantages of the present invention will become better understood when the following detailed description is read with reference to the accompanying figures in which like characters represent like parts throughout the figures, wherein:
Figure 1 illustrates the framework of Intelligent Analysis of Document Driven QA Chatbot.
Figure 2 depicts the two ways to upload user documents via the User Interface.
Figure 3 depicts the User Interface for Knowledge Bot Chat.
Furthermore, in terms of the construction of the device, one or more components of the device may have been represented in the figures by conventional symbols, and the figures may show only those specific details that are pertinent to understanding the embodiments of the present invention so as not to obscure the figures with details that will be readily apparent to those of ordinary skill in the art having benefit of the description herein.
DETAILED DESCRIPTION OF THE INVENTION
For the purpose of promoting an understanding of the principles of the invention, reference will now be made to the embodiment illustrated in the figures and specific language will be used to describe the same. It will nevertheless be understood that no limitation of the scope of the invention is thereby intended, such alterations and further modifications in the illustrated system, and such further applications of the principles of the invention as illustrated therein being contemplated as would normally occur to one skilled in the art to which the invention relates.
It will be understood by those skilled in the art that the foregoing general description and the following detailed description are exemplary and explanatory of the invention and are not intended to be restrictive thereof.
Reference throughout this specification to “an aspect”, “another aspect” or similar language means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrase “in an embodiment”, “in another embodiment” and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment.
The terms "comprises", "comprising", or any other variations thereof, are intended to cover a non-exclusive inclusion, such that a process or method that comprises a list of steps does not include only those steps but may include other steps not expressly listed or inherent to such process or method. Similarly, one or more devices or systems or elements or structures or components preceded by "comprises... a" does not, without more constraints, preclude the existence of other devices or other systems or other elements or other structures or other components or additional devices or additional systems or additional elements or additional structures or additional components.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The system, methods, and examples provided herein are illustrative only and not intended to be limiting.
The terms “a” and “an” herein do not denote a limitation of quantity, but rather denote the presence of at least one of the referenced items.
The terms “having”, “comprising”, “including”, and variations thereof signify the presence of a component.
The present embodiment describes the Intelligent Analysis of Document Driven QA Chatbot, which takes input from the user. The UI accepts simple text from a text box (100), text documents (101), and images (102). Data collection is followed by data preprocessing (103). Text from the text box and from text documents requires no external preprocessing, since the simple vector model handles it. An image given as input contains text data (104). The image is preprocessed using the computer vision module cv2: it is converted to grayscale, noise is removed, contours are added, and the font is dilated. The text is then extracted from the image using the pytesseract module. From this point on, the process is the same for all three input types. The extracted text is provided as input to a module called GPTSimpleVectorIndex (105), which performs vector embedding on the raw text. The text is converted into an index file containing key-value pairs (106). The bot is built on this index file so that when a user asks a question (107), it selects a relevant value from among the key-value pairs and returns the answer (108).
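The convergence of the three input paths into a single text stream, as described in this embodiment, could be sketched as follows. The function name, the suffix set, and the injected `ocr` callable are hypothetical; the OCR step is passed in as a parameter so that the routing logic stays independent of any particular OCR library.

```python
from pathlib import Path

# Image types routed through OCR (an assumption for this sketch).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def collect_text(text_box="", file_paths=(), ocr=None):
    """Merge text-box input, .txt files, and OCR'd images into one string."""
    parts = []
    if text_box.strip():
        parts.append(text_box.strip())
    for raw_path in file_paths:
        path = Path(raw_path)
        suffix = path.suffix.lower()
        if suffix == ".txt":
            # Text documents need no preprocessing; read them as-is.
            parts.append(path.read_text(encoding="utf-8").strip())
        elif suffix in IMAGE_SUFFIXES:
            if ocr is None:
                raise ValueError(f"no OCR function supplied for {path.name}")
            parts.append(ocr(str(path)).strip())
        else:
            raise ValueError(f"unsupported file type: {path.name}")
    # From here on, every input type is just text for the indexing step.
    return "\n\n".join(parts)
```

For example, `collect_text("typed note", ["scan.png"], ocr=my_ocr)` would return the typed note followed by whatever text `my_ocr` extracts from the image, ready to be handed to the indexing module.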
FEATURES
1. Login System
The user needs to log in with the username and password to access the application.
2. Get Documents from the user
The application gets the document as input from the user. The supported document types are .txt, .pdf, and also image files in the format of .png, .jpeg, and .jpg. It also allows users to give text directly into the text box.
3. Extract text data
To train the model, we need text data. The uploaded document goes through a series of processing steps and the text is extracted. For image files, the text is extracted through Optical Character Recognition. Each page in a PDF is converted to image data and its text is extracted through OCR.
4. Build the Bot
The text extracted from the document is fed as input to the chatbot. This is a specialized feature of this Document Analysis. It provides a knowledge bot that answers the user's prompts. It is essentially a chatbot with advanced features.
5. Dealing with chat history
The advanced features of this chatbot include downloading the chat history in the form of a .txt file. The user can also clear the chat history whenever required.
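The chat-history feature can be pictured with a small in-memory class. The class name, method names, and the "User:/Bot:" line format are assumptions for illustration; the source only states that the history can be downloaded as a .txt file and cleared.

```python
class ChatHistory:
    """In-memory chat log that can be exported as plain text or cleared."""

    def __init__(self):
        self.turns = []  # list of (question, answer) pairs

    def add(self, question, answer):
        """Record one exchange between the user and the bot."""
        self.turns.append((question, answer))

    def as_text(self):
        """Render the whole conversation as plain text."""
        return "\n".join(f"User: {q}\nBot: {a}" for q, a in self.turns)

    def save(self, path):
        """Write the conversation to a .txt file for download."""
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(self.as_text())

    def clear(self):
        """Discard all recorded turns."""
        self.turns.clear()
```

A UI handler would call `add` after each answer, `save` when the user requests a download, and `clear` when the user resets the conversation.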
EXPERIMENTAL DEMOS
i. This user interface is used to upload user documents through two options, as represented in Fig. 2: normal text input, or a custom file type such as an image, text document, or PDF file. Once users upload a file, they can view the content of the document and train the LLM model by clicking “Build the Bot”. The text box below “Verify the content” displays the actual content of the given document.
ii. The user interface to chat with Knowledge Bot, where Knowledge Bot is built upon the given document, is represented in Fig. 3. Users can use the “AskMe” textbox to converse with Knowledge Bot and get answers to their questions from the document. A clear prompt yields a more accurate response from Knowledge Bot.
ADVANTAGES OF INTELLIGENT DOCUMENT ANALYSIS
It reduces the risk of human error in interpreting and analyzing documents. Each document in an organization is important, and analyzing documents properly is an essential part of business intelligence and analysis. It also reduces search latency because it focuses only on the Region of Interest in the document.
1. Question Answering: The AskMe feature allows users to clarify all queries related to the provided document.
2. Sentiment Analysis: It is used to identify the type of sentiment, such as positive or negative.
3. Verification: It checks whether specific text, such as a person, a place, or anything related to a specific use case, is present in the document.
4. Summarization: It is also used to summarize the whole document within a given maximum number of words, sentences, or paragraphs.
IMPORTANCE
This is a generalized project and, when improved, can be applied to various domains to predict future outcomes related to the document. As analysis takes up most of the process in data science and data analysis, this will help improve the performance metrics and operational efficiency of data analysis in offering business solutions.
COST OF DESIGN AND IMPLEMENTATION
In the proposed model:
Server to store user database - Rs 4560 per month
Deployment using Hugging Face Spaces with GPU - Rs 34560 per month
---------------------------------------------------------------------------------
Total cost
Claims: CLAIMS
We claim,
1. The Intelligent Analysis of Document Driven QA Chatbot comprises:
i. a UI made in such a way that it can get simple text as input from the text box (100), text documents (101), and even images (102),
ii. data preprocessing (103) following data collection, wherein the text derived from the text box and the text document does not require external preprocessing since the simple vector model performs it,
iii. the extracted text being provided as input to a module called GPTSimpleVectorIndex (105),
iv. the text being converted into an index file containing key-value pairs (106),
v. the bot being built on this indexed file such that when a user queries a question (107), it chooses a relevant value among all the key-value pairs and returns the answer (108).
2. The Intelligent Analysis of Document Driven QA Chatbot as disclosed in claim 1, wherein this model is useful for many business needs. We accomplished the implementation of this project using the above-mentioned methodologies.
3. The Intelligent Analysis of Document Driven QA Chatbot as disclosed in claim 1, wherein the implementation works quite well and the project can extract text from various file inputs and build a bot on the text file to satisfy user queries.
4. The Intelligent Analysis of Document Driven QA Chatbot as disclosed in claim 1, wherein, when this project comes into existence, it will be a boon to many business organizations where data plays a major role.
5. The Intelligent Analysis of Document Driven QA Chatbot as disclosed in claim 1, wherein our proposed method can potentially reduce healthcare costs by preventing expensive hospitalizations and interventions, and by optimizing healthcare resource allocation. It also provides a valuable tool for healthcare providers in population health management and public health surveillance.
| # | Name | Date |
|---|---|---|
| 1 | 202341032924-STATEMENT OF UNDERTAKING (FORM 3) [10-05-2023(online)].pdf | 2023-05-10 |
| 2 | 202341032924-REQUEST FOR EARLY PUBLICATION(FORM-9) [10-05-2023(online)].pdf | 2023-05-10 |
| 3 | 202341032924-FORM-9 [10-05-2023(online)].pdf | 2023-05-10 |
| 4 | 202341032924-FORM 1 [10-05-2023(online)].pdf | 2023-05-10 |
| 5 | 202341032924-DRAWINGS [10-05-2023(online)].pdf | 2023-05-10 |
| 6 | 202341032924-COMPLETE SPECIFICATION [10-05-2023(online)].pdf | 2023-05-10 |