Abstract:
TRANSFORMER-BASED HANDWRITING RECOGNITION SYSTEM
A transformer-based handwriting recognition system (100) is disclosed. The system (100) comprises an input unit (102) adapted to upload images and a processor (104). The processor (104) is configured to: receive the uploaded images from the input unit (102); check a presence of handwritten text in the received images; engage a Quantum Convolutional Neural Network (QCNN) (106) to extract local handwriting features from the received images; activate a transformer-based encoder (108) to capture global contextual understanding of the handwritten text in the received images; employ a feature fusion layer (110) to integrate the local handwriting features and the global contextual understanding to initiate a recognition of the handwritten text; and execute a decoding mechanism (112) to map and convert the recognized handwritten text into a machine-readable text format. The system (100) employs self-supervised learning, allowing it to adapt to diverse handwriting styles without large annotated datasets.
Claims: 10, Figures: 2. Figure 1 is selected.
Description:
BACKGROUND
Field of Invention
[001] Embodiments of the present invention generally relate to a handwriting recognition system and particularly to a transformer-based handwriting recognition system.
Description of Related Art
[002] Handwritten text recognition is a crucial area in automated document processing, playing a significant role in industries such as banking, healthcare, legal documentation, and government record-keeping. Traditional Optical Character Recognition (OCR) methods have demonstrated limited success in extracting handwritten text due to the variations in writing styles, ink quality, paper texture, and image distortions. Handwriting is inherently less structured than printed text, making it difficult for conventional OCR techniques to achieve high accuracy. The problem is further exacerbated when processing historical documents, signatures, or handwritten forms, where faded ink, overlapping characters, and background noise reduce recognition efficiency.
[003] Existing approaches for handwritten text extraction rely heavily on deep learning models, particularly Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), which have improved accuracy compared to classical OCR techniques. However, these models often struggle with large-scale datasets containing diverse handwriting samples and different languages. Additionally, they require extensive labeled data for training, making them less adaptable to real-world scenarios where labeled handwritten datasets are scarce. Despite advancements in neural networks, traditional methods fail to generalize well across different handwriting styles, leading to inconsistent results in complex documents.
[004] Recent developments in artificial intelligence have explored hybrid approaches that combine convolutional networks with attention-based mechanisms such as transformers. These approaches attempt to capture both local and global context for more robust recognition. Nevertheless, challenges persist in handling noise, variable text sizes, and complex backgrounds in handwritten documents. As industries seek more reliable solutions for handwritten text recognition, there remains a strong need for models that can efficiently extract text with high accuracy while being adaptable to a wide range of handwriting variations.
[005] There is thus a need for an improved and advanced transformer-based handwriting recognition system that can address the aforementioned limitations in a more efficient manner.
SUMMARY
[006] Embodiments in accordance with the present invention provide a transformer-based handwriting recognition system. The system comprises an input unit adapted to upload images. The system further comprises a processor communicatively connected to the input unit. The processor is configured to receive the uploaded images from the input unit; check a presence of handwritten text in the received images; engage a Quantum Convolutional Neural Network (QCNN) to extract local handwriting features from the received images; activate a transformer-based encoder to capture global contextual understanding of the handwritten text in the received images; employ a feature fusion layer to integrate the local handwriting features and the global contextual understanding to initiate a recognition of the handwritten text; and execute a decoding mechanism to map and convert the recognized handwritten text into a machine-readable text format.
[007] Embodiments in accordance with the present invention further provide a method for transformer-based handwriting recognition. The method comprises the steps of receiving uploaded images from an input unit; checking a presence of handwritten text in the received images; engaging a Quantum Convolutional Neural Network (QCNN) to extract local handwriting features from the received images; activating a transformer-based encoder to capture global contextual understanding of the handwritten text in the received images; employing a feature fusion layer to integrate the local handwriting features and the global contextual understanding to initiate a recognition of the handwritten text; and executing a decoding mechanism to map and convert the recognized handwritten text into a machine-readable text format.
[008] Embodiments of the present invention may provide a number of advantages depending on their particular configuration. First, embodiments of the present application may provide a transformer-based handwriting recognition system.
[009] Next, embodiments of the present application may provide a handwriting recognition system that achieves 96% accuracy, surpassing traditional OCR and deep learning models.
[0010] Next, embodiments of the present application may provide a handwriting recognition system that employs self-supervised learning, which allows the system to adapt to diverse handwriting styles without large annotated datasets.
[0011] Next, embodiments of the present application may provide a handwriting recognition system that effectively handles image distortions, background noise, and low-quality scans, which are common challenges in handwritten text extraction.
[0012] Next, embodiments of the present application may provide a handwriting recognition system that utilizes Quantum Convolutional Neural Networks (QCNNs) for extracting local features and transformers for capturing global context, leading to a more comprehensive feature representation.
[0013] Next, embodiments of the present application may provide a handwriting recognition system that is highly useful in finance, healthcare, legal, and government sectors for automating document processing in critical applications.
[0014] These and other advantages will be apparent from the present application of the embodiments described herein.
[0015] The preceding is a simplified summary to provide an understanding of some embodiments of the present invention. This summary is neither an extensive nor exhaustive overview of the present invention and its various embodiments. The summary presents selected concepts of the embodiments of the present invention in a simplified form as an introduction to the more detailed description presented below. As will be appreciated, other embodiments of the present invention are possible utilizing, alone or in combination, one or more of the features set forth above or described in detail below.
BRIEF DESCRIPTION OF THE DRAWINGS
[0016] The above and still further features and advantages of embodiments of the present invention will become apparent upon consideration of the following detailed description of embodiments thereof, especially when taken in conjunction with the accompanying drawings, and wherein:
[0017] FIG. 1 illustrates a block diagram of a transformer-based handwriting recognition system, according to an embodiment of the present invention; and
[0018] FIG. 2 depicts a flowchart of a method for transformer-based handwriting recognition, according to an embodiment of the present invention.
[0019] The headings used herein are for organizational purposes only and are not meant to be used to limit the scope of the description or the claims. As used throughout this application, the word "may" is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words “include”, “including”, and “includes” mean including but not limited to. To facilitate understanding, like reference numerals have been used, where possible, to designate like elements common to the figures. Optional portions of the figures may be illustrated using dashed or dotted lines, unless the context of usage indicates otherwise.
DETAILED DESCRIPTION
[0020] The following description includes the preferred best mode of one embodiment of the present invention. It will be clear from this description of the invention that the invention is not limited to these illustrated embodiments but that the invention also includes a variety of modifications and embodiments thereto. Therefore, the present description should be seen as illustrative and not limiting. While the invention is susceptible to various modifications and alternative constructions, it should be understood that there is no intention to limit the invention to the specific form disclosed, but, on the contrary, the invention is to cover all modifications, alternative constructions, and equivalents falling within the scope of the invention as defined in the claims.
[0021] In any embodiment described herein, the open-ended terms "comprising", "comprises", and the like (which are synonymous with "including", "having", and "characterized by") may be replaced by the respective partially closed phrases "consisting essentially of", "consists essentially of", and the like, or the respective closed phrases "consisting of", "consists of", and the like.
[0022] As used herein, the singular forms “a”, “an”, and “the” designate both the singular and the plural, unless expressly stated to designate the singular only.
[0023] FIG. 1 illustrates a block diagram of a transformer-based handwriting recognition system 100 (hereinafter referred to as the system 100), according to an embodiment of the present invention. The system 100 may be adapted to identify handwritten text in an uploaded digital file. Further, the system 100 may be adapted to recognize the handwritten text and convert it into machine-readable text. The digital file may be, but is not limited to, a document file, a presentation file, a video file, and so forth. In a preferred embodiment of the present invention, the digital file may be an image file. Embodiments of the present invention are intended to include or otherwise cover any type of the digital file that may be uploaded to the system 100, including known, related art, and/or later developed technologies.
[0024] The system 100 may comprise an input unit 102, a processor 104, a Quantum Convolutional Neural Network (QCNN) 106, a transformer-based encoder 108, a feature fusion layer 110, and a decoding mechanism 112.
[0025] In an embodiment of the present invention, the input unit 102 may be adapted to upload the image file(s) to the system 100. The image file(s) may comprise the handwritten text that needs to be processed for recognition. The input unit 102 may be configured to support various file formats, including but not limited to, JPEG, PNG, BMP, TIFF, and PDF, ensuring compatibility with a wide range of digital documents. The input unit 102 may facilitate image acquisition from multiple sources (not shown), such as direct uploads from a local storage device, scanned documents from optical scanners, or real-time image capture from cameras and mobile devices.
[0026] Furthermore, the input unit 102 may be configured to preprocess the uploaded image file(s) before further processing. The preprocessing of the uploaded image file(s) may include operations such as resizing, grayscale conversion, contrast enhancement, noise reduction, background normalization, and so forth to improve the quality of handwritten text extraction. The input unit 102 may also support adaptive thresholding techniques to refine the text regions to ensure optimal feature extraction in subsequent stages. The input unit 102 may be, but is not limited to, a mobile device, a computer, and so forth. Embodiments of the present invention are intended to include or otherwise cover any type of the input unit 102, including known, related art, and/or later developed technologies.
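By way of a non-limiting illustration, two of the preprocessing operations named above (grayscale conversion and adaptive thresholding) may be sketched as follows. The function names, the window radius, and the offset value are assumptions made for this sketch and are not specified in the present description; a local-mean rule is used here as one common form of adaptive thresholding, and other rules may equally be used.

```python
from typing import List, Tuple

Image = List[List[float]]  # rows of grayscale intensities in [0, 255]


def to_grayscale(rgb: List[List[Tuple[float, float, float]]]) -> Image:
    """Convert RGB pixels to luminance using the ITU-R BT.601 weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def adaptive_threshold(img: Image, radius: int = 1, offset: float = 5.0) -> List[List[int]]:
    """Mark a pixel as ink (1) when it is darker than its local mean minus offset.

    radius and offset are illustrative defaults (assumptions), not tuned values.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clip the neighbourhood window at the image borders.
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [img[j][i] for j in ys for i in xs]
            local_mean = sum(window) / len(window)
            out[y][x] = 1 if img[y][x] < local_mean - offset else 0
    return out


# Usage: a bright page with one dark stroke pixel in the middle.
page = [[200.0] * 3 for _ in range(3)]
page[1][1] = 50.0
mask = adaptive_threshold(page)
```

Because the threshold is computed per neighbourhood rather than once for the whole page, such a rule may tolerate uneven lighting or stained backgrounds better than a single global threshold.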
[0027] In an embodiment of the present invention, the processor 104 may be communicatively connected to the input unit 102. The processor 104 may be configured to receive the uploaded images from the input unit 102. The processor 104 may be configured to check a presence of the handwritten text in the received images.
[0028] In an embodiment of the present invention, the processor 104 may be configured to engage the Quantum Convolutional Neural Network (QCNN) 106 to extract local handwriting features from the received images. In an embodiment of the present invention, the Quantum Convolutional Neural Network (QCNN) 106 may utilize quantum gates and quantum entanglement for enhanced local handwriting feature extraction and handwriting variability adaptation. The processor 104 may be configured to activate the transformer-based encoder 108 to capture global contextual understanding of the handwritten text in the received images. In an embodiment of the present invention, the transformer-based encoder 108 may process a hierarchical structure of the handwritten text to preserve spatial relationships and improve recognition.
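By way of a non-limiting illustration, the use of quantum gates and entanglement for local feature extraction may be pictured with a classically simulated two-qubit filter. The circuit below (RY rotations to encode two neighbouring pixel values, a single CNOT gate to entangle them, and a Pauli-Z expectation value as the output feature) is an assumption chosen for this sketch; the present description does not specify the circuit, the encoding, or the number of qubits used by the QCNN 106.

```python
import math
from typing import List


def ry_on_zero(theta: float) -> List[float]:
    """Amplitudes of RY(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>."""
    return [math.cos(theta / 2), math.sin(theta / 2)]


def two_qubit_filter(p0: float, p1: float) -> float:
    """Encode two pixels in [0, 1], entangle them, and return <Z> on qubit 1."""
    a, b = ry_on_zero(math.pi * p0), ry_on_zero(math.pi * p1)
    # Product state in basis order |00>, |01>, |10>, |11> (qubit 0 on the left).
    state = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    # CNOT (control qubit 0, target qubit 1) swaps the |10> and |11> amplitudes.
    state[2], state[3] = state[3], state[2]
    probs = [amp * amp for amp in state]  # amplitudes are real in this circuit
    # <Z> on qubit 1: +1 when qubit 1 reads 0 (|00>, |10>), -1 otherwise.
    return probs[0] + probs[2] - probs[1] - probs[3]


def quanv_row(row: List[float]) -> List[float]:
    """Slide the two-pixel filter along one image row with stride 1."""
    return [two_qubit_filter(row[i], row[i + 1]) for i in range(len(row) - 1)]


# Usage: pixel intensities are assumed to be pre-scaled to [0, 1].
features = quanv_row([0.0, 0.0, 1.0, 1.0])
```

For this particular circuit the output works out to cos(pi*p0) * cos(pi*p1), so the entangling gate makes the measured feature depend jointly on both pixels; a trainable QCNN would additionally carry learnable rotation angles, which are omitted here.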
[0029] The processor 104 may be configured to employ the feature fusion layer 110 to integrate the local handwriting features and the global contextual understanding to initiate a recognition of the handwritten text. In an embodiment of the present invention, the feature fusion layer 110 may employ attention mechanisms to combine the local handwriting features and the global contextual understanding. The processor 104 may be configured to execute the decoding mechanism 112 to map and convert the recognized handwritten text into a machine-readable text format. In an embodiment of the present invention, the decoding mechanism 112 may implement sequence-to-sequence learning to generate accurate text output from the recognized handwritten text.
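By way of a non-limiting illustration, an attention-based combination of local features and global context may be sketched as a single cross-attention step in which each local feature vector attends over the context vectors and the attended result is added back to it. The function names are hypothetical, and the learned query, key, and value projections that a trained fusion layer would normally contain are deliberately omitted; the present description does not specify the internal structure of the feature fusion layer 110.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def fuse(local: List[Vector], context: List[Vector]) -> List[Vector]:
    """For each local feature, attend over the context vectors and add the result.

    Assumes local and context vectors share the same dimension d.
    """
    d = len(context[0])
    fused = []
    for q in local:
        # Scaled dot-product attention weights of this local feature over the context.
        weights = softmax([dot(q, k) / math.sqrt(d) for k in context])
        attended = [sum(w * k[i] for w, k in zip(weights, context)) for i in range(d)]
        # Residual connection keeps the original local information.
        fused.append([qi + ai for qi, ai in zip(q, attended)])
    return fused


# Usage: two local stroke features fused with three context vectors.
fused = fuse([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
```

The residual addition is one possible design choice: it lets the fused representation fall back on the local feature when the context carries little relevant information.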
[0030] The processor 104 may be, but is not limited to, a Programmable Logic Control (PLC) unit, a microprocessor, a development board, and so forth. Embodiments of the present invention are intended to include or otherwise cover any type of the processor 104, including known, related art, and/or later developed technologies.
[0031] In an exemplary embodiment of the present invention, the system 100 may be implemented for recognizing and digitizing handwritten doctor prescriptions. The input unit 102 may be adapted to receive image files of handwritten prescriptions captured from pharmacy scanners, mobile devices, or electronic health record (EHR) systems. The input unit 102 may support real-time image acquisition to allow healthcare professionals to upload prescriptions directly for processing. Upon receiving the prescription image, the processor 104 may be configured to verify the presence of handwritten text and eliminate unnecessary artifacts such as stamps, signatures, or background noise. The Quantum Convolutional Neural Network (QCNN) 106 may extract key handwriting features, identifying medical terminologies, dosage instructions, and drug names. The transformer-based encoder 108 may then analyze the spatial relationships between words, ensuring accurate interpretation of medical abbreviations and prescription structures.
[0032] The feature fusion layer 110 may integrate the extracted handwriting features with contextual understanding, refining recognition accuracy for complex prescriptions. The decoding mechanism 112 may then convert the handwritten content into structured, machine-readable text, enabling pharmacies to process prescriptions digitally. The system 100 may further cross-check drug names against pharmaceutical databases to reduce errors and enhance patient safety.
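By way of a non-limiting illustration, the cross-check of recognized drug names against a pharmaceutical database may be sketched with approximate string matching from the Python standard library. The function name, the similarity cutoff, and the three-entry formulary are hypothetical stand-ins; the present description does not specify the database or the matching rule.

```python
import difflib
from typing import List, Optional


def match_drug_name(recognized: str, formulary: List[str], cutoff: float = 0.8) -> Optional[str]:
    """Return the closest formulary entry to the recognized text, or None.

    cutoff is an illustrative similarity threshold (assumption), not a validated value.
    """
    # Compare case-insensitively but return the formulary's own spelling.
    lookup = {name.lower(): name for name in formulary}
    hits = difflib.get_close_matches(recognized.lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None


# Usage with a hypothetical formulary: a one-letter recognition slip is resolved.
formulary = ["Amoxicillin", "Metformin", "Atorvastatin"]
suggestion = match_drug_name("Amoxicilin", formulary)
```

Since distinct drugs may have similar spellings, an inexact match of this kind may be better surfaced for review by a pharmacist than applied automatically.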
[0033] In another exemplary embodiment of the present invention, the system 100 may be implemented for recognizing and digitizing old handwritten recipes. The input unit 102 may be adapted to receive image files of aged, handwritten recipe documents captured from scanned cookbooks, family journals, or mobile phone images. The input unit 102 may support image enhancement techniques to process faded ink, smudged handwriting, or deteriorated paper textures. Upon receiving the recipe image, the processor 104 may be configured to detect the presence of handwritten text while filtering out stains, creases, or background noise that may interfere with recognition. The Quantum Convolutional Neural Network (QCNN) 106 may extract handwriting features such as ingredient names, measurement units, and step-by-step instructions, even when written in varying styles or cursive script. The transformer-based encoder 108 may analyze contextual relationships between words, ensuring accurate interpretation of abbreviations, shorthand notations, and cooking-specific terminologies. The feature fusion layer 110 may integrate the extracted handwriting features with contextual understanding, refining recognition accuracy for handwritten measurements and formatting inconsistencies. The decoding mechanism 112 may then convert the handwritten content into structured, machine-readable text, allowing users to store, share, or print the digital version of the recipe.
[0034] It should be understood that the embodiments described above are merely exemplary implementations of the present invention. The system 100 may be used for numerous handwriting recognition applications beyond doctor prescriptions and old handwritten recipes. The present invention may be adapted for recognizing and digitizing handwritten notes, legal documents, historical manuscripts, academic papers, financial records, personal journals, and various other handwritten materials. The system 100 may be configured to support diverse industries, including healthcare, education, finance, law, and archival preservation.
[0035] FIG. 2 depicts a flowchart of a method 200 for transformer-based handwriting recognition, according to an embodiment of the present invention.
[0036] At step 202, the system 100 may receive the uploaded images from the input unit 102.
[0037] At step 204, the system 100 may check the presence of the handwritten text in the received images.
[0038] At step 206, the system 100 may engage the Quantum Convolutional Neural Network (QCNN) 106 to extract the local handwriting features from the received images.
[0039] At step 208, the system 100 may activate the transformer-based encoder 108 to capture the global contextual understanding of the handwritten text in the received images.
[0040] At step 210, the system 100 may employ the feature fusion layer 110 to integrate the local handwriting features and the global contextual understanding to initiate the recognition of the handwritten text.
[0041] At step 212, the system 100 may execute the decoding mechanism 112 to map and convert the recognized handwritten text into the machine-readable text format.
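By way of a non-limiting illustration, the ordering of steps 202 to 212 of the method 200 may be sketched as a pipeline of interchangeable stages. The class name, the stage signatures, and the choice to return no result when no handwritten text is found are assumptions made for this sketch; the stub stages in the usage example are placeholders and do not perform actual recognition.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class RecognitionPipeline:
    has_handwriting: Callable[[Any], bool]   # step 204: presence check
    extract_local: Callable[[Any], Any]      # step 206: QCNN 106
    encode_global: Callable[[Any], Any]      # step 208: transformer-based encoder 108
    fuse: Callable[[Any, Any], Any]          # step 210: feature fusion layer 110
    decode: Callable[[Any], str]             # step 212: decoding mechanism 112

    def run(self, image: Any) -> Optional[str]:
        """Step 202 is the receipt of `image`; the remaining steps run in order."""
        if not self.has_handwriting(image):
            return None  # assumption: images without handwriting are skipped
        local = self.extract_local(image)
        context = self.encode_global(image)
        return self.decode(self.fuse(local, context))


# Usage with trivial stand-in stages operating on a string instead of an image.
pipeline = RecognitionPipeline(
    has_handwriting=lambda img: bool(img),
    extract_local=lambda img: list(img),
    encode_global=lambda img: len(img),
    fuse=lambda local, ctx: local[:ctx],
    decode=lambda fused: "".join(fused),
)
print(pipeline.run("note"))
```

Keeping each stage behind a plain callable allows any one component to be replaced or tested in isolation without altering the order of the steps.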
[0042] While the invention has been described in connection with what is presently considered to be the most practical and various embodiments, it is to be understood that the invention is not to be limited to the disclosed embodiments, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the scope of the appended claims.
[0043] This written description uses examples to disclose the invention, including the best mode, and also to enable any person skilled in the art to practice the invention, including making and using any devices or systems and performing any incorporated methods. The patentable scope of the invention is defined in the claims, and may include other examples that occur to those skilled in the art. Such other examples are intended to be within the scope of the claims if they have structural elements that do not differ from the literal language of the claims, or if they include equivalent structural elements with insubstantial differences from the literal language of the claims.
Claims:
CLAIMS
I/We Claim:
1. A transformer-based handwriting recognition system (100), the system (100) comprising:
an input unit (102) adapted to upload images;
a processor (104) communicatively connected to the input unit (102), characterized in that the processor (104) is configured to:
receive the uploaded images from the input unit (102);
check a presence of handwritten text in the received images;
engage a Quantum Convolutional Neural Network (QCNN) (106) to extract local handwriting features from the received images;
activate a transformer-based encoder (108) to capture global contextual understanding of the handwritten text in the received images;
employ a feature fusion layer (110) to integrate the local handwriting features and the global contextual understanding to initiate a recognition of the handwritten text; and
execute a decoding mechanism (112) to map and convert the recognized handwritten text into a machine-readable text format.
2. The system (100) as claimed in claim 1, wherein the Quantum Convolutional Neural Network (QCNN) (106) utilizes quantum gates and quantum entanglement for enhanced local handwriting feature extraction and handwriting variability adaptation.
3. The system (100) as claimed in claim 1, wherein the transformer-based encoder (108) processes a hierarchical structure of the handwritten text to preserve spatial relationships and improve recognition.
4. The system (100) as claimed in claim 1, wherein the feature fusion layer (110) employs attention mechanisms to combine the local handwriting features and the global contextual understanding.
5. The system (100) as claimed in claim 1, wherein the decoding mechanism (112) implements sequence-to-sequence learning to generate accurate text output from the recognized handwritten text.
6. A method (200) for transformer-based handwriting recognition, the method (200) is characterized by steps of:
receiving uploaded images from an input unit (102);
checking a presence of handwritten text in the received images;
engaging a Quantum Convolutional Neural Network (QCNN) (106) to extract local handwriting features from the received images;
activating a transformer-based encoder (108) to capture global contextual understanding of the handwritten text in the received images;
employing a feature fusion layer (110) to integrate the local handwriting features and the global contextual understanding to initiate a recognition of the handwritten text; and
executing a decoding mechanism (112) to map and convert the recognized handwritten text into a machine-readable text format.
7. The method (200) as claimed in claim 6, wherein the Quantum Convolutional Neural Network (QCNN) (106) utilizes quantum gates and quantum entanglement for enhanced local handwriting feature extraction and handwriting variability adaptation.
8. The method (200) as claimed in claim 6, wherein the transformer-based encoder (108) processes a hierarchical structure of the handwritten text to preserve spatial relationships and improve recognition.
9. The method (200) as claimed in claim 6, wherein the feature fusion layer (110) employs attention mechanisms to combine the local handwriting features and the global contextual understanding.
10. The method (200) as claimed in claim 6, wherein the decoding mechanism (112) implements sequence-to-sequence learning to generate accurate text output from the recognized handwritten text.
Date: March 12, 2025
Place: Noida
Nainsi Rastogi
Patent Agent (IN/PA-2372)
Agent for the Applicant
| # | Name | Date |
|---|---|---|
| 1 | 202541022535-STATEMENT OF UNDERTAKING (FORM 3) [13-03-2025(online)].pdf | 2025-03-13 |
| 2 | 202541022535-REQUEST FOR EARLY PUBLICATION(FORM-9) [13-03-2025(online)].pdf | 2025-03-13 |
| 3 | 202541022535-POWER OF AUTHORITY [13-03-2025(online)].pdf | 2025-03-13 |
| 4 | 202541022535-OTHERS [13-03-2025(online)].pdf | 2025-03-13 |
| 5 | 202541022535-FORM-9 [13-03-2025(online)].pdf | 2025-03-13 |
| 6 | 202541022535-FORM FOR SMALL ENTITY(FORM-28) [13-03-2025(online)].pdf | 2025-03-13 |
| 7 | 202541022535-FORM 1 [13-03-2025(online)].pdf | 2025-03-13 |
| 8 | 202541022535-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [13-03-2025(online)].pdf | 2025-03-13 |
| 9 | 202541022535-EDUCATIONAL INSTITUTION(S) [13-03-2025(online)].pdf | 2025-03-13 |
| 10 | 202541022535-DRAWINGS [13-03-2025(online)].pdf | 2025-03-13 |
| 11 | 202541022535-DECLARATION OF INVENTORSHIP (FORM 5) [13-03-2025(online)].pdf | 2025-03-13 |
| 12 | 202541022535-COMPLETE SPECIFICATION [13-03-2025(online)].pdf | 2025-03-13 |
| 13 | 202541022535-Proof of Right [21-05-2025(online)].pdf | 2025-05-21 |