Abstract: SYSTEM AND METHOD FOR OPTICAL CHARACTER READING FOR AUTOMATED LPG CYLINDER FILLING OPERATION

A system for optical character reading for an automated LPG cylinder filling operation is provided. The system includes an image acquisition module to receive and select an image with identification marks on an LPG cylinder. An image processing module is configured to assign one or more labels corresponding to the one or more identification marks, segment one or more characters present in the one or more identification marks, and extract one or more segmented characters present in the one or more identification marks via an image classification technique. The recognition or classification technique includes a UNET based classification model or a rotational region convolution neural network (R2CNN) model. A character reading module is configured to concatenate the one or more characters extracted by the image processing module to generate one or more readable identification marks. FIG. 1
Claims: WE CLAIM:
1. A system (10) for optical character reading for an automated LPG cylinder filling operation, the system comprising:
a fisheye lens (60) positioned at a predefined height with respect to an LPG cylinder, and configured to capture a plurality of image frames (70) corresponding to the LPG cylinder;
an image acquisition module (20) operable by one or more processors, wherein the image acquisition module (20) is configured to
receive the plurality of captured image frames (70); and
select one or more image frames (70) from the plurality of captured image frames comprising one or more identification marks, wherein the one or more identification marks comprise at least one of a tare weight of the LPG cylinder, a DPT code of the LPG cylinder, or a combination thereof;
an image processing module (30) operable by the one or more processors and operatively coupled to the image acquisition module (20), wherein the image processing module (30) is configured to:
assign one or more labels corresponding to the one or more identification marks in relation to each of one or more selected image frames (70), wherein each of the one or more selected image frames (70) is assigned different label values according to distinct image frame (70) pixels;
segment one or more characters present in the one or more identification marks corresponding to each of one or more assigned labels, wherein each of the one or more identification marks is segmented using one or more boundaries by implementation of a contouring technique; and
extract one or more segmented characters present in the one or more identification marks via an image classification technique;
a character reading module (40) operable by the one or more processors and operatively coupled to the image processing module (30), wherein the character reading module (40) is configured to concatenate one or more extracted characters to generate one or more readable identification marks; and
a character display module (50) operable by the one or more processors and operatively coupled to the character reading module (40), wherein the character display module (50) is configured to display the one or more readable identification marks in a readable presentation for a user interpretation, wherein the readable presentation comprises the LPG cylinder inspection results, the LPG cylinder inspection details and the LPG cylinder inspection statistics.
2. The system (10) as claimed in claim 1, wherein the one or more processors are hosted on a server.
3. The system (10) as claimed in claim 2, wherein the plurality of image frames (70) is saved on the server with metadata associated with the image (70).
4. The system (10) as claimed in claim 1, wherein the image classification technique comprises a UNET based classification model and a rotational region convolution neural network (R2CNN) model.
5. The system (10) as claimed in claim 4, wherein the UNET based classification model is configured to:
extract the one or more segmented characters as a patch with a relative position of the corresponding one or more characters in each of the plurality of received image frames; and
classify the one or more characters upon extraction to generate the one or more readable identification marks based on a trained ResNet model.
6. The system (10) as claimed in claim 4, wherein the rotational region convolution neural network (R2CNN) model is configured to:
generate one or more axis aligned boxes around the corresponding one or more identification marks using a region proposal network;
extract a plurality of features from each of the one or more received image frames;
train the plurality of features with the one or more axis aligned boxes and one or more inclined boxes using a fast region convolution neural network (RCNN) to obtain a plurality of parameters;
apply an inclined non-max suppression technique to filter the one or more inclined boxes based on a predefined threshold value; and
obtain the one or more readable identification marks based on a location and orientation generated from the one or more inclined boxes filtered.
7. The system (10) as claimed in claim 1, wherein the image processing module (30) is also configured to grade the image as a correctly read image or a wrongly read image, wherein the wrongly read image is fine-tuned based on a deep learning-based model.
8. A method (530) for optical character reading for an automated LPG cylinder filling operation, the method comprising:
receiving, by an image acquisition module, a plurality of captured image frames corresponding to an LPG cylinder (540);
selecting, by the image acquisition module, one or more image frames from the plurality of captured image frames comprising one or more identification marks (550);
assigning, by an image processing module, one or more labels corresponding to the one or more identification marks in relation to each of one or more selected image frames (560);
segmenting, by the image processing module, one or more characters present in the one or more identification marks corresponding to each of one or more assigned labels (570);
extracting, by the image processing module, one or more segmented characters present in the one or more identification marks via an image classification technique (580);
concatenating, by a character reading module, one or more extracted characters to generate one or more readable identification marks (590); and
displaying, by a character display module, the one or more readable identification marks in a readable presentation for a user interpretation (600).
9. The method (530) as claimed in claim 8, comprising grading, by the image processing module, the image as a correctly read image or a wrongly read image, wherein the wrongly read image is fine-tuned based on a deep learning-based model.
10. The method (530) as claimed in claim 8, wherein the selecting, by the image acquisition module, is based on the one or more identification marks comprising at least one of a tare weight of the LPG cylinder, a DPT code of the LPG cylinder, or a combination thereof.
11. The method (530) as claimed in claim 8, wherein the assigning, by the image processing module, comprises assigning different label values according to distinct image frame pixels to each of the one or more selected image frames.
12. The method (530) as claimed in claim 8, wherein the segmenting, by the image processing module, comprises segmenting the one or more characters present in the one or more identification marks using one or more boundaries by implementation of a contouring technique.
13. The method (530) as claimed in claim 8, wherein the displaying, by the character display module, comprises displaying the readable presentation comprising the LPG cylinder inspection results, the LPG cylinder inspection details and the LPG cylinder inspection statistics.
14. The method (530) as claimed in claim 8, wherein the image classification technique used in the extracting, by the image processing module, comprises a UNET based classification model and a rotational region convolution neural network (R2CNN) model.
Dated this 17th day of September 2020
Signature
Vidya Bhaskar Singh Nandiyal
Patent Agent (IN/PA-2912)
Agent for the Applicant
Description:
FIELD OF INVENTION
[0001] Embodiments of the present disclosure relate to image processing, and more particularly to a system and a method for optical character reading for an automated LPG cylinder filling operation.
BACKGROUND
[0002] Extracting data from images and building meaningful information from the extracted data is a complex and time-consuming task, as a number of different texts, numbers, and symbols must be identified and correlated. Typically, such data extraction and information building are done manually and are prone to human errors. More recently, technology-based systems have been employed to capture digital images and to extract information and make decisions from the captured images. One such application is extracting textual information from digital images using optical character recognition (OCR) techniques.
[0003] Existing OCR techniques are built on the predefined symbols, numbers, and text on which they have been trained. These predefined patterns and images serve as ‘training data’ which is referred to during the OCR process. Of late, machine learning and deep learning based approaches have been developed which require a large amount of training data to perform a given OCR operation. However, as the digital images and the training data available in them are very limited, training a machine learning model for OCR to identify the data with a high level of accuracy is challenging. Further, once an OCR technique has been trained for, or has learnt, a set of symbols, it is difficult to apply it to a new set of images which may be similar to the previous set yet contain many new symbols that the OCR technique may not recognize. Additionally, there are many situations in which the data in the digital images may vary due to multiple factors.
[0004] For example, data printed on certain objects, such as LPG containers, is sometimes difficult to read. The digital images of such data are highly inconsistent and depend on various factors such as image resolution, noise, font size and type variation, and so forth. The data printed on LPG containers mainly states the tare weight of the cylinder and the DPT code. The tare weight is required to determine the amount (weight) of LPG to be filled, and the DPT code is the date by which the cylinder needs to be recertified to ensure safety standards.
[0005] Moreover, in the digital images, the information is split across various places and needs to be associated correctly. Existing OCR techniques are therefore not able to perform with good accuracy on such digital images. Further, pre-defined OCR techniques are not only ineffective but may also be erroneous.
[0006] Hence, there is a need for an improved optical character reading system for an automated LPG cylinder filling operation, and a method of operating the same, to address the aforementioned issues.
BRIEF DESCRIPTION
[0007] In accordance with one embodiment of the disclosure, a system for optical character reading for an automated LPG cylinder filling operation is disclosed. The system includes a fisheye lens. The fisheye lens is positioned at a predefined height with respect to an LPG cylinder. The fisheye lens is configured to capture a plurality of image frames corresponding to the LPG cylinder.
[0008] The system also includes one or more processors. The system also includes an image acquisition module. The image acquisition module is operable by the one or more processors. The image acquisition module is configured to receive the plurality of captured image frames. The image acquisition module is also configured to select one or more image frames from the plurality of captured image frames comprising one or more identification marks.
[0009] The system also includes an image processing module operable by the one or more processors. The image processing module is operatively coupled to the image acquisition module. The image processing module is configured to assign one or more labels corresponding to the one or more identification marks in relation to each of one or more selected image frames. The image processing module is also configured to segment one or more characters present in the one or more identification marks corresponding to each of one or more assigned labels. The image processing module is also configured to extract one or more segmented characters present in the one or more identification marks via an image classification technique.
[0010] The system also includes a character reading module operable by the one or more processors. The character reading module is operatively coupled to the image processing module. The character reading module is configured to concatenate one or more extracted characters to generate one or more readable identification marks. The system also includes a character display module operable by the one or more processors. The character display module is operatively coupled to the character reading module. The character display module is configured to display the one or more readable identification marks in a readable presentation for a user interpretation.
[0011] In accordance with one embodiment of the disclosure, a method for optical character reading for an automated LPG cylinder filling operation is disclosed. The method includes receiving a plurality of captured image frames corresponding to an LPG cylinder. The method also includes selecting one or more image frames from the plurality of captured image frames comprising one or more identification marks. The method also includes assigning one or more labels corresponding to the one or more identification marks in relation to each of one or more selected image frames. The method also includes segmenting one or more characters present in the one or more identification marks corresponding to each of one or more assigned labels.
[0012] The method also includes extracting one or more segmented characters present in the one or more identification marks via an image classification technique. The method also includes concatenating one or more extracted characters to generate one or more readable identification marks. The method also includes displaying the one or more readable identification marks in a readable presentation for a user interpretation.
[0013] To further clarify the advantages and features of the present disclosure, a more particular description of the disclosure will follow by reference to specific embodiments thereof, which are illustrated in the appended figures. It is to be appreciated that these figures depict only typical embodiments of the disclosure and are therefore not to be considered limiting in scope. The disclosure will be described and explained with additional specificity and detail with the appended figures.
BRIEF DESCRIPTION OF THE DRAWINGS
[0014] The disclosure will be described and explained with additional specificity and detail with the accompanying figures in which:
[0015] FIG. 1 is a block diagram representation of a system for optical character reading for an automated LPG cylinder filling operation in accordance with an embodiment of the present disclosure;
[0016] FIG. 2 is a block diagram representation of one embodiment of the system of FIG. 1, depicting a training architecture corresponding to image processing module in accordance with an embodiment of the present disclosure;
[0017] FIG. 3 is a block diagram representation of one embodiment of the system of FIG. 1, depicting an inferencing architecture corresponding to character reading module in accordance with an embodiment of the present disclosure;
[0018] FIG. 4 is a schematic representation of one embodiment of the system of FIG. 1 depicting the display section corresponding to character display module in accordance with an embodiment of the present disclosure;
[0019] FIG. 5 is a block diagram representation of one embodiment of the system of FIG. 1 depicting the grading architecture in accordance with an embodiment of the present disclosure;
[0020] FIG. 6 is a schematic representation of an exemplary system for optical character reading of printed characters of FIG. 1 in accordance with an embodiment of the present disclosure;
[0021] FIG. 7 illustrates a schematic representation of an exemplary working of a Deep Learning based classification model using a UNET architecture of FIG. 6 in accordance with an embodiment of the present disclosure;
[0022] FIG. 8 illustrates a schematic representation of an exemplary working of a Deep Learning based model using the Rotational Region Convolution Neural Network (R2CNN) model of FIG. 6 in accordance with an embodiment of the present disclosure;
[0023] FIG. 9 is a computer or a server for the system in accordance with an embodiment of the present disclosure; and
[0024] FIG. 10 is a flow chart representing the steps involved in a method for optical character reading for an automatic LPG cylinder filling operation in accordance with an embodiment of the present disclosure.
[0025] Further, those skilled in the art will appreciate that elements in the figures are illustrated for simplicity and may not have necessarily been drawn to scale. Furthermore, in terms of the construction of the device, one or more components of the device may have been represented in the figures by conventional symbols, and the figures may show only those specific details that are pertinent to understanding the embodiments of the present disclosure so as not to obscure the figures with details that will be readily apparent to those skilled in the art having the benefit of the description herein.
DETAILED DESCRIPTION
[0026] For the purpose of promoting an understanding of the principles of the disclosure, reference will now be made to the embodiments illustrated in the figures and specific language will be used to describe them. It will nevertheless be understood that no limitation of the scope of the disclosure is thereby intended. Such alterations and further modifications in the illustrated system, and such further applications of the principles of the disclosure as would normally occur to those skilled in the art, are to be construed as being within the scope of the present disclosure.
[0027] The terms "comprises", "comprising", or any other variations thereof, are intended to cover a non-exclusive inclusion, such that a process or method that comprises a list of steps does not include only those steps but may include other steps not expressly listed or inherent to such a process or method. Similarly, one or more devices or subsystems or elements or structures or components preceded by "comprises... a" does not, without more constraints, preclude the existence of other devices, subsystems, elements, structures, components, additional devices, additional subsystems, additional elements, additional structures or additional components. Appearances of the phrase "in an embodiment", "in another embodiment" and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment.
[0028] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the art to which this disclosure belongs. The system, methods, and examples provided herein are only illustrative and not intended to be limiting.
[0029] In the following specification and the claims, reference will be made to a number of terms, which shall be defined to have the following meanings. The singular forms “a”, “an”, and “the” include plural references unless the context clearly dictates otherwise.
[0030] Embodiments of the present disclosure relate to a system and a method for reading printed information on an LPG cylinder using deep learning based optical character reading techniques. The system includes an image acquisition module configured to receive and select an image of one or more identification marks on the LPG cylinder. A fisheye lens is used to capture the LPG cylinder image.
[0031] The system also includes an image processing module operatively coupled to the image acquisition module. The image processing module is configured to assign one or more labels corresponding to the one or more identification marks in relation to each of one or more selected image frames. The image processing module is also configured to segment one or more characters present in the one or more identification marks corresponding to each of one or more assigned labels. The image processing module is also configured to extract one or more segmented characters present in the one or more identification marks via an image classification technique.
[0032] A computer system (standalone, client or server computer system) configured by an application may constitute a “platform” or “module” that is configured and operated to perform certain operations. In one embodiment, the “platform” or “module” may be implemented mechanically or electronically, so a platform or module may comprise dedicated circuitry or logic that is permanently configured (within a special-purpose processor) to perform certain operations. In another embodiment, a “platform” or “module” may also comprise programmable logic or circuitry (as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software to perform certain operations.
[0033] Accordingly, the term “module” or “platform” should be understood to encompass a tangible entity, be that an entity that is physically constructed permanently configured (hardwired) or temporarily configured (programmed) to operate in a certain manner and/or to perform certain operations described herein.
[0034] FIG. 1 is a block diagram representation of a system (10) for optical character reading for an automated LPG cylinder filling operation in accordance with an embodiment of the present disclosure. The system (10) includes a fisheye lens (60). The fisheye lens (60) is positioned at a predefined height with respect to an LPG cylinder. The fisheye lens (60) is configured to capture a plurality of image frames (70) corresponding to the LPG cylinder. The fisheye lens (60) mainly focuses on capturing the print on the stay plate as well as the shoulder of the LPG cylinder.
[0035] The system (10) also includes one or more processors. In one embodiment, the one or more processors are hosted on a server (80). In such an embodiment, the server may be a cloud server or a local server of a computing device. In such an embodiment, the computing device may include a computer, a tablet, a laptop, a mobile phone or the like. It is pertinent to note that the plurality of image frames (70) as captured by the fisheye lens (60) is saved on the server (80) with metadata associated with the image.
[0036] The system (10) also includes an image acquisition module (20). The image acquisition module (20) is operable by the one or more processors. The image acquisition module (20) is configured to receive the plurality of captured image frames (70). The image acquisition module (20) is also configured to select one or more image frames (70) from the plurality of captured image frames (70) comprising one or more identification marks. In one embodiment, the one or more identification marks comprise at least one of a tare weight of the LPG cylinder, a DPT code of the LPG cylinder, or a combination thereof.
[0037] In one such exemplary embodiment, the image (70) of the one or more identification marks corresponding to a chosen LPG cylinder may be obtained from a storage medium which stores the image (70). In another such exemplary embodiment, the image acquisition module (20), as stated, first receives multiple image frames (70) from the image capturing devices and then selects, from the multiple captured images (70), the images (70) comprising one or more identification marks.
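By way of illustration only, the acquisition and selection steps may be sketched as follows. This is a minimal sketch assuming OpenCV; the contains_identification_marks heuristic is a hypothetical stand-in for whatever frame-selection logic the image acquisition module (20) actually applies.

```python
import cv2

def contains_identification_marks(frame, edge_fraction=0.02):
    # Hypothetical stand-in heuristic: frames with enough edge pixels
    # are taken to show the printed stay-plate / shoulder text.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)
    return (edges > 0).mean() > edge_fraction

def acquire_and_select(camera_index=0, num_frames=30):
    # Receive a plurality of frames from the fisheye camera and select
    # only those that appear to contain identification marks.
    cap = cv2.VideoCapture(camera_index)
    selected = []
    for _ in range(num_frames):
        ok, frame = cap.read()
        if not ok:
            break
        if contains_identification_marks(frame):
            selected.append(frame)
    cap.release()
    return selected
```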
[0038] Furthermore, the system (10) also includes an image processing module (30) operable by the one or more processors. The image processing module (30) is operatively coupled to the image acquisition module (20). The image processing module (30) is configured to assign one or more labels corresponding to the one or more identification marks in relation to each of the one or more selected image frames (70). In one embodiment, the connected components in the image (70) are uniquely labelled based on a predefined heuristic. A connected component in the image (70) is a set of pixels that form a connected group. The system (10) labels the distinct pixel groups with corresponding label values.
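A minimal sketch of such connected-component labelling, assuming OpenCV; the Otsu binarization step is an illustrative assumption, not prescribed by the disclosure:

```python
import cv2

def label_marks(gray):
    # Binarize so that printed characters become foreground pixels
    # (Otsu thresholding is an illustrative choice).
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # Each connected group of pixels receives a distinct label value,
    # mirroring the per-pixel label assignment described above.
    num_labels, labels = cv2.connectedComponents(binary)
    return num_labels, labels
```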
[0039] The image processing module (30) is also configured to segment one or more characters present in the one or more identification marks corresponding to each of the one or more assigned labels. In one embodiment, each of the one or more identification marks is segmented using one or more boundaries by implementation of a contouring technique. For segmentation, the system (10) locates the identification marks on the image using the one or more labels and creates one or more boundaries around the corresponding identification marks.
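In OpenCV terms, the contour-based segmentation might look like the following sketch; the minimum-area filter and the left-to-right ordering are illustrative assumptions:

```python
import cv2

def segment_characters(binary):
    # Trace boundaries around each labelled mark and return
    # axis-aligned crops of the individual characters.
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    crops = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if w * h > 25:  # drop tiny noise blobs (illustrative threshold)
            crops.append(((x, y, w, h), binary[y:y + h, x:x + w]))
    # Left-to-right order preserves the relative character positions.
    crops.sort(key=lambda item: item[0][0])
    return crops
```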
[0040] The image processing module (30) is also configured to extract one or more segmented characters present in the one or more identification marks via an image classification technique. In one embodiment, the image classification technique includes, but is not limited to, a UNET based classification model and a rotational region convolution neural network (R2CNN) model.
[0041] In one specific embodiment, the UNET based classification model extracts the one or more characters as patches, each with the relative position of the corresponding character in the image (70); a post-processing technique is applied for this extraction. The extracted one or more characters are then classified by a trained ResNet model to generate the one or more readable identification marks.
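A hedged PyTorch sketch of the patch-classification stage follows; the UNET segmentation itself is omitted, and NUM_CLASSES, the patch format, and the choice of resnet18 are assumptions made for illustration only:

```python
import torch
import torch.nn.functional as F
import torchvision

NUM_CLASSES = 11  # e.g. digits 0-9 plus '.'; an assumption, not from the source

resnet = torchvision.models.resnet18(num_classes=NUM_CLASSES)
resnet.eval()  # a trained checkpoint would normally be loaded here

def classify_patches(patches):
    # patches: list of (x_position, HxW float tensor) pairs cut out of
    # the UNET character mask (hypothetical format).
    readable = []
    for x, patch in patches:
        inp = F.interpolate(patch[None, None], size=(64, 64))
        inp = inp.expand(-1, 3, -1, -1)  # grey -> 3-channel batch of one
        with torch.no_grad():
            logits = resnet(inp)
        readable.append((x, int(logits.argmax(dim=1))))
    # The relative positions recover the reading order of the mark.
    readable.sort()
    return [cls for _x, cls in readable]
```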
[0042] In another specific embodiment, the rotational region convolution neural network (R2CNN) model uses a region proposal network (RPN) to generate one or more axis aligned boxes around the one or more identification marks with orientation. The R2CNN model extracts the features from the image (70) with different pooled sizes for the regions proposed by the RPN. Further, the R2CNN model trains the features with the one or more axis aligned boxes and one or more inclined boxes using a fast region convolution neural network (RCNN) to obtain multiple parameters.
[0043] In such an embodiment, the multiple parameters may include axis aligned box coordinates, inclined box coordinates, and corresponding box scores with a predicted class for the individual characters. The R2CNN model further applies an inclined non-max suppression technique to filter the one or more inclined boxes based on a predefined threshold value. In continuation, the R2CNN model obtains the one or more readable identification marks based on a location and orientation generated from the filtered inclined boxes upon applying post-processing techniques.
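A minimal sketch of the inclined non-max suppression step, using OpenCV's rotated-rectangle intersection; the IoU formula and threshold handling are assumptions about how the filtering could be realized:

```python
import cv2
import numpy as np

def inclined_nms(boxes, scores, iou_threshold=0.3):
    # boxes: list of ((cx, cy), (w, h), angle) rotated rectangles.
    # Keep the highest-scoring boxes and suppress any inclined box whose
    # rotated IoU with a kept box exceeds the predefined threshold.
    order = np.argsort(scores)[::-1]
    keep = []
    for i in order:
        suppressed = False
        for j in keep:
            inter = 0.0
            status, region = cv2.rotatedRectangleIntersection(boxes[i], boxes[j])
            if status != cv2.INTERSECT_NONE and region is not None:
                inter = cv2.contourArea(region)
            area_i = boxes[i][1][0] * boxes[i][1][1]
            area_j = boxes[j][1][0] * boxes[j][1][1]
            iou = inter / (area_i + area_j - inter + 1e-9)
            if iou > iou_threshold:
                suppressed = True
                break
        if not suppressed:
            keep.append(i)
    return keep
```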
[0044] The image processing module (30) is also configured to grade the image as a correctly read image or a wrongly read image. In such an embodiment, the wrongly read image is fine-tuned based on a deep learning-based model.
[0045] The system (10) also includes a character reading module (40) operable by the one or more processors. The character reading module (40) is operatively coupled to the image processing module (30). The character reading module (40) is configured to concatenate one or more extracted characters to generate one or more readable identification marks. As used herein, the readable identification marks are in a human-readable format which is a representation of data or information that may be naturally read by humans. The one or more characters are concatenated to form a single word which may be representative of the identification mark.
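The concatenation itself is straightforward; a sketch, assuming each extracted character carries its horizontal position within the mark:

```python
def concatenate_characters(extracted):
    # extracted: list of (x_position, character) pairs for one mark.
    ordered = sorted(extracted)
    return "".join(ch for _x, ch in ordered)

# Example: concatenate_characters([(75, '9'), (10, 'B'), (42, '2')]) -> 'B29'
```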
[0046] The system (10) also includes a character display module (50) operable by the one or more processors. The character display module (50) is operatively coupled to the character reading module (40). The character display module (50) is configured to display the one or more readable identification marks in a readable presentation for a user interpretation. In one embodiment, the readable presentation comprises the LPG cylinder inspection results, the LPG cylinder inspection details and the LPG cylinder inspection statistics. In such an embodiment, the concatenated readable words are updated automatically at specific sections for user interpretation.
[0047] FIG. 2 is a block diagram representation of one embodiment of the system of FIG. 1, depicting a training architecture (90) corresponding to image processing module in accordance with an embodiment of the present disclosure. The input images of the LPG cylinder are uploaded from a server such as cloud storage or local storage and fed to the training pipeline (100). Further, the image processing module (30) segments one or more characters present in the one or more identification marks present on the LPG cylinder image based on the one or more assigned labels (110). In one embodiment, the image processing module (30) locates the identification marks on the image using the one or more labels (110) and creates one or more boundaries around the corresponding identification marks.
[0048] Specifically, the deep learning-based model is trained on various input images, where the images are labelled (110) with the identification marks present on the LPG cylinder, for example, the tare weight and date code. Further, the system (90) trains segmentation and classification models to read the characters present on the object. Subsequently, the system (90) analyses the model performance in the evaluation (130) step and uploads the deployable (140) solution to the server (80).
[0049] FIG. 3 is a block diagram representation of one embodiment of the system of FIG. 1, depicting an inferencing architecture (150) corresponding to the character reading module in accordance with an embodiment of the present disclosure. The camera (160) and the deployed solution are integrated with an edge PC (180). The edge PC (180) grabs (170) the images using the camera (160) drivers and feeds the grabbed images to an image processing solution (190), where the image processing solution (190) runs the deployed solution on the images and provides the tare weight and date code. Further, the images are saved (200) with the metadata on the edge PC (180), and from the edge PC (180) these images are uploaded to the cloud (80). It is pertinent to note that the metadata, which contains the tare weight and date code received from the inference, is later used for grading.
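How the edge PC might persist an image together with its inference metadata can be sketched as follows; the file layout and field names are assumptions, not taken from the disclosure:

```python
import cv2
import json
import time
from pathlib import Path

def save_with_metadata(image, tare_weight, date_code, out_dir="captures"):
    # The inference result is stored beside the image so that the
    # grading stage can later compare it against verified values.
    Path(out_dir).mkdir(exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    cv2.imwrite(f"{out_dir}/{stamp}.png", image)
    meta = {"tare_weight": tare_weight, "date_code": date_code,
            "captured_at": stamp}
    with open(f"{out_dir}/{stamp}.json", "w") as f:
        json.dump(meta, f)
```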
[0050] FIG. 4 is a schematic representation of one embodiment of the system of FIG. 1 depicting the display section corresponding to character display module (50) in accordance with an embodiment of the present disclosure. The display section (50) clearly indicates the LPG cylinder inspection results, the LPG cylinder inspection details and the LPG cylinder inspection statistics.
[0051] FIG. 5 is a block diagram representation of one embodiment of the system (10) of FIG. 1 depicting the grading architecture (210) in accordance with an embodiment of the present disclosure. In one embodiment, the image processing module (30) pulls the image from the storage medium, wherein the storage medium stores the image captured by the image acquisition device. The images are stored with the metadata in the storage medium. The image processing module (30) analyses the images and determines whether each image is a correctly read image or a wrongly read image. In such an embodiment, the grading process configured in the image processing module (30) takes out the wrongly read images, relabels them correctly, and then fine-tunes/improves the existing solution. The wrongly read images are further analysed and inserted into a training loop for retraining and performance correction.
[0052] In detail, the image processing module (30) grades and retains the wrongly read images. Such images are labelled (270) and retrained (280) to obtain correctly read images, and the improved solution is deployed (300) on the server (80) with improved accuracy.
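A schematic sketch of the grading pass, assuming each record pairs the inference output with an operator-verified ground truth; the record field names are hypothetical:

```python
def grade_and_collect(records):
    # records: list of dicts holding predicted and verified marks.
    # Wrongly read images are collected so they can be relabelled and
    # fed back into the training loop to fine-tune the deployed model.
    wrongly_read = []
    for rec in records:
        correct = (rec["predicted_tare"] == rec["verified_tare"]
                   and rec["predicted_dpt"] == rec["verified_dpt"])
        rec["grade"] = "correct" if correct else "wrong"
        if not correct:
            wrongly_read.append(rec)
    return wrongly_read
```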
[0053] FIG. 6 is a schematic representation of an exemplary system for optical character reading of printed characters (10) of FIG. 1 in accordance with an embodiment of the present disclosure. Consider an example, where identification marks such as DPT code (320) and the tare weight (330) printed on the LPG container (310) are difficult to read. To read the DPT code (320) and the tare weight (330) printed on the LPG container (310), the system (10) includes a camera (60) with a fisheye lens to capture an image of LPG container (310) with the DPT code (320) and the tare weight (330) printed on the LPG container (310). The system (10) further includes the image processing module (30) in communication with the server (80), where the image processing module (30) identifies the DPT code (320) and the tare weight (330) on the image.
[0054] Furthermore, the image processing module (30) creates one or more boundaries around the corresponding DPT code (320) and the tare weight (330). Further, the one or more characters present in the DPT code (320) and the tare weight (330) are segmented using the one or more created boundaries by implementation of deep learning-based techniques, for example a contouring technique. Consider that the characters of the DPT code (320) are “B”, “2” and “9” and the characters of the tare weight (330) are “1”, “5”, “.” and “9” as printed on the LPG container (310). Moreover, the image processing module (30) extracts the one or more segmented characters present in the DPT code (320) and the tare weight (330) based on an image classification model. The image classification model includes a UNET based classification model or a rotational region convolution neural network (R2CNN) model. The classification of the characters of the DPT code and the tare weight using the UNET based classification model (350) is described in FIG. 7, and the classification using the rotational region convolution neural network (R2CNN) model (420) is described in FIG. 8.
[0055] FIG. 7 illustrates a schematic representation of an exemplary working of a Deep Learning based classification model using a UNET architecture (350) of FIG. 6 in accordance with an embodiment of the present disclosure. The UNET based classification model (350) segments the characters present in the DPT code and the tare weight and sends them to the pre-processing block, where the pre-processing model (370) extracts the characters as patches with the relative positions of the corresponding characters in the DPT code and the tare weight. The output of the pre-processing block is the individual extracted characters. Further, such individual characters are classified by the trained ResNet model (380). The classified characters are then fed to the post-processing block (390), where the post-processing block (390) combines them into the tare weight (410) and the DPT code (400) in a readable format.
[0056] FIG. 8 illustrates a schematic representation of an exemplary working of a Deep Learning based model using the Rotational Region Convolution Neural Network (R2CNN) model (420) of FIG. 6 in accordance with an embodiment of the present disclosure. The R2CNN model (420) uses a region proposal network to generate the axis aligned boxes (450) which include the tare weight and the DPT code with orientation. The R2CNN model (420) extracts the features (440) from the image segment of the DPT code and the tare weight with different pooled sizes for the regions proposed by the RPN (430). The Fast RCNN network is trained on such extracted features (440) to obtain the axis aligned box coordinates (460), the inclined box coordinates (480), and the corresponding box scores (470) with a predicted class for the individual oriented characters. The R2CNN model (420) further applies an inclined non-max suppression technique (485) to filter the inclined boxes (450) based on a predefined threshold value. The characters obtained from the extraction are “B”, “2” and “9” and “1”, “5”, “.” and “9”.
[0057] Referring back to FIG. 6, the character reading module (40) concatenates the characters “B”, “2” and “9” into “B29” to provide the readable DPT code (400), and “1”, “5”, “.” and “9” into “15.9” to provide the tare weight (410), in a readable form based on the location and orientation generated by the model. The image processing module (30) determines whether the readable form image is correctly generated or wrongly generated. In this example, the image processing module determines that the readable form image is correctly generated and further fine-tuning is not required.
[0058] FIG. 9 is a block diagram of a computer or a server (490) in accordance with an embodiment of the present disclosure. The server (490) includes processor(s) (520), and memory (500) coupled to the processor(s) (520).
[0059] The processor(s) (520), as used herein, means any type of computational circuit, such as, but not limited to, a microprocessor, a microcontroller, a complex instruction set computing microprocessor, a reduced instruction set computing microprocessor, a very long instruction word microprocessor, an explicitly parallel instruction computing microprocessor, a digital signal processor, or any other type of processing circuit, or a combination thereof.
[0060] The memory (500) includes a plurality of modules stored in the form of executable programs which instruct the processor (520) via a bus (510) to perform the method steps illustrated in FIG. 10. The memory (500) has the following modules: the image acquisition module (20), the image processing module (30), the character reading module (40) and the character display module (50).
[0061] The image acquisition module (20) is configured to receive the plurality of captured image frames. The image acquisition module (20) is also configured to select one or more image frames from the plurality of captured image frames comprising one or more identification marks. The image processing module (30) is configured to assign one or more labels corresponding to the one or more identification marks in relation to each of one or more selected image frames. The image processing module (30) is also configured to segment one or more characters present in the one or more identification marks corresponding to each of one or more assigned labels. The image processing module (30) is also configured to extract one or more segmented characters present in the one or more identification marks via an image classification technique.
[0062] The character reading module (40) is configured to concatenate one or more extracted characters to generate one or more readable identification marks. The character display module (50) is configured to display the one or more readable identification marks in a readable presentation for a user interpretation.
[0063] Computer memory elements may include any suitable memory device(s) for storing data and executable program, such as read only memory, random access memory, erasable programmable read only memory, electrically erasable programmable read only memory, hard drive, removable media drive for handling memory cards and the like. Embodiments of the present subject matter may be implemented in conjunction with program modules, including functions, procedures, data structures, and application programs, for performing tasks, or defining abstract data types or low-level hardware contexts. Executable programs stored on any of the above-mentioned storage media may be executable by the processor(s) (520).
[0064] FIG. 10 is a flow chart representing the steps involved in a method (530) for optical character reading for an automatic LPG cylinder filling operation in accordance with an embodiment of the present disclosure.
[0065] The method (530) includes receiving a plurality of captured image frames corresponding to an LPG cylinder in step 540. In one embodiment, receiving the plurality of captured image frames corresponding to the LPG cylinder includes receiving the plurality of captured image frames corresponding to the LPG cylinder by an image acquisition module.
[0066] The method (530) also includes selecting one or more image frames from the plurality of captured image frames comprising one or more identification marks in step 550. In one embodiment, selecting the one or more image frames from the plurality of captured image frames comprising the one or more identification marks includes selecting the one or more image frames from the plurality of captured image frames comprising the one or more identification marks by the image acquisition module. In another embodiment, selecting the one or more image frames from the plurality of captured image frames comprising the one or more identification marks includes selecting based on the one or more identification marks comprising at least one of tare weight of the LPG cylinder, DPT code of the LPG cylinder or a combination thereof.
[0067] The method (530) also includes assigning one or more labels corresponding to the one or more identification marks in relation to each of one or more selected image frames in step 560. In one embodiment, assigning the one or more labels corresponding to the one or more identification marks in relation to each of the one or more selected image frames includes assigning the one or more labels corresponding to the one or more identification marks in relation to each of the one or more selected image frames by an image processing module. In another embodiment, assigning the one or more labels corresponding to the one or more identification marks in relation to each of the one or more selected image frames includes assigning different label values according to distinct image frame pixels to each of one or more selected image frames.
[0068] The method (530) also includes segmenting one or more characters present in the one or more identification marks corresponding to each of one or more assigned labels in step 570. In one embodiment, segmenting the one or more characters present in the one or more identification marks corresponding to each of the one or more assigned labels includes segmenting the one or more characters present in the one or more identification marks corresponding to each of the one or more assigned labels by the image processing module. In another embodiment, segmenting the one or more characters present in the one or more identification marks corresponding to each of the one or more assigned labels includes segmenting the one or more characters present in the one or more identification marks using one or more boundaries by implementation of deep learning-based techniques, for example a contouring technique.
[0069] The method (530) also includes extracting one or more segmented characters present in the one or more identification marks via an image classification technique in step 580. In one embodiment, extracting the one or more segmented characters present in the one or more identification marks via the image classification technique includes extracting the one or more segmented characters present in the one or more identification marks by the image processing module. In another embodiment, extracting the one or more segmented characters present in the one or more identification marks via the image classification technique includes extracting via an image classification technique comprising a UNET based classification model or a rotational region convolution neural network (R2CNN) model.
[0070] The method (530) also includes grading the image as a correctly read image or a wrongly read image. In one embodiment, grading the image as the correctly read image or the wrongly read image includes grading the image as the correctly read image or the wrongly read image by the image processing module. In one embodiment, grading the image as the correctly read image or the wrongly read image includes fine tuning the wrongly read image based on a deep learning-based model.
[0071] The method (530) also includes concatenating one or more extracted characters to generate one or more readable identification marks in step 590. In one embodiment, concatenating the one or more extracted characters to generate the one or more readable identification marks includes concatenating the one or more extracted characters to generate the one or more readable identification marks by a character reading module.
[0072] The method (530) also includes displaying the one or more readable identification marks in a readable presentation for a user interpretation in step 600. In one embodiment, displaying the one or more readable identification marks in the readable presentation for the user interpretation includes displaying the one or more readable identification marks in the readable presentation for the user interpretation by a character display module. In another embodiment, displaying the one or more readable identification marks in the readable presentation for the user interpretation includes displaying the readable presentation comprising the LPG cylinder inspection results, the LPG cylinder inspection details and the LPG cylinder inspection statistics.
[0073] Various embodiments of the system and method for optical character reading corresponding to LPG cylinders as described above enable higher accuracy in reading the characters, as the system uses deep learning-based models, compared to conventional systems. The system may read all the different configurations: where the printed identification marks run from top to bottom or from bottom to top, where the DPT code and the tare weight are combined at a single location on the LPG container, and where the print is increasingly difficult to read.
[0074] The system dynamically learns new characters and dynamically adapts to different datasets through deep learning and transfer learning mechanisms, thereby ensuring correct recognition of a large number of different characters as well as a large number of variations of each character. Thus, models once trained to perform OCR on one type of dataset may be easily trained and put to use on other, similar types of datasets, even when those datasets have insufficient training data, noisy data, or corrupt data. Further, dynamic building and training of machine learning models based on deep learning and transfer learning mechanisms ensures that the techniques described in the embodiments discussed above are accurate and robust for a large number of different characters as well as a large number of variations of each character.
[0075] While specific language has been used to describe the disclosure, any limitations arising on account of the same are not intended. As would be apparent to a person skilled in the art, various working modifications may be made to the method in order to implement the inventive concept as taught herein.
[0076] The figures and the foregoing description give examples of embodiments. Those skilled in the art will appreciate that one or more of the described elements may well be combined into a single functional element. Alternatively, certain elements may be split into multiple functional elements. Elements from one embodiment may be added to another embodiment. For example, order of processes described herein may be changed and are not limited to the manner described herein. Moreover, the actions of any flow diagram need not be implemented in the order shown; nor do all of the acts need to be necessarily performed. Also, those acts that are not dependent on other acts may be performed in parallel with the other acts. The scope of embodiments is by no means limited by these specific examples.
| # | Name | Date |
|---|---|---|
| 1 | 202041040372-STATEMENT OF UNDERTAKING (FORM 3) [17-09-2020(online)].pdf | 2020-09-17 |
| 2 | 202041040372-FORM FOR SMALL ENTITY(FORM-28) [17-09-2020(online)].pdf | 2020-09-17 |
| 3 | 202041040372-FORM FOR SMALL ENTITY [17-09-2020(online)].pdf | 2020-09-17 |
| 4 | 202041040372-FORM 1 [17-09-2020(online)].pdf | 2020-09-17 |
| 5 | 202041040372-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [17-09-2020(online)].pdf | 2020-09-17 |
| 6 | 202041040372-EVIDENCE FOR REGISTRATION UNDER SSI [17-09-2020(online)].pdf | 2020-09-17 |
| 7 | 202041040372-DRAWINGS [17-09-2020(online)].pdf | 2020-09-17 |
| 8 | 202041040372-DECLARATION OF INVENTORSHIP (FORM 5) [17-09-2020(online)].pdf | 2020-09-17 |
| 9 | 202041040372-COMPLETE SPECIFICATION [17-09-2020(online)].pdf | 2020-09-17 |
| 10 | 202041040372-Proof of Right [18-09-2020(online)].pdf | 2020-09-18 |
| 11 | 202041040372-MSME CERTIFICATE [18-09-2020(online)].pdf | 2020-09-18 |
| 12 | 202041040372-FORM28 [18-09-2020(online)].pdf | 2020-09-18 |
| 13 | 202041040372-FORM-9 [18-09-2020(online)].pdf | 2020-09-18 |
| 14 | 202041040372-FORM-26 [18-09-2020(online)].pdf | 2020-09-18 |
| 15 | 202041040372-FORM 18A [18-09-2020(online)].pdf | 2020-09-18 |
| 16 | 202041040372-RELEVANT DOCUMENTS [12-01-2021(online)].pdf | 2021-01-12 |
| 17 | 202041040372-OTHERS [12-01-2021(online)].pdf | 2021-01-12 |
| 18 | 202041040372-MARKED COPIES OF AMENDEMENTS [12-01-2021(online)].pdf | 2021-01-12 |
| 19 | 202041040372-FORM-26 [12-01-2021(online)].pdf | 2021-01-12 |
| 20 | 202041040372-FORM 13 [12-01-2021(online)].pdf | 2021-01-12 |
| 21 | 202041040372-FER_SER_REPLY [12-01-2021(online)].pdf | 2021-01-12 |
| 22 | 202041040372-CLAIMS [12-01-2021(online)].pdf | 2021-01-12 |
| 23 | 202041040372-AMMENDED DOCUMENTS [12-01-2021(online)].pdf | 2021-01-12 |
| 24 | 202041040372-Correspondence to notify the Controller [21-04-2021(online)].pdf | 2021-04-21 |
| 25 | 202041040372-Written submissions and relevant documents [11-05-2021(online)].pdf | 2021-05-11 |
| 26 | 202041040372-PatentCertificate08-06-2021.pdf | 2021-06-08 |
| 27 | 202041040372-IntimationOfGrant08-06-2021.pdf | 2021-06-08 |
| 28 | 202041040372-US(14)-HearingNotice-(HearingDate-23-04-2021).pdf | 2021-10-18 |
| 29 | 202041040372-FER.pdf | 2021-10-18 |
| 30 | 202041040372-RELEVANT DOCUMENTS [30-09-2022(online)].pdf | 2022-09-30 |
| 31 | 202041040372-RELEVANT DOCUMENTS [30-09-2022(online)]-1.pdf | 2022-09-30 |
| 1 | searchE_19-10-2020.pdf | |