Context Aware Ontology Based Information Extraction

Abstract: Systems and methods for context aware ontology based information extraction are described. In one embodiment, the method comprises pre-processing an unstructured text document to obtain an induced tree, wherein the induced tree represents words and grammatical relations between the words in the unstructured text document as induced tree nodes. Further, the method comprises creating an object graph based on the induced tree, wherein the object graph comprises a plurality of object graph nodes including entity nodes, property nodes, and relation nodes. Furthermore, the method comprises identifying an ontological type of each of the plurality of the object graph nodes based at least on the entropy scores of the entity nodes, and generating structured information from the unstructured text document upon identifying the ontological type.

Patent Information

Application #
Filing Date
14 December 2012
Publication Number
32/2014
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Email
Parent Application
Patent Number
Legal Status
Grant Date
2020-08-05
Renewal Date

Applicants

TATA CONSULTANCY SERVICES LIMITED
Nirmal Building  9th Floor  Nariman Point  Mumbai  Maharashtra 400021

Inventors

1. SHAH  Sapankumar Hiteshchandra
Tata Research Development and Design Centre  Tata Consultancy Services  54 B Hadapsar Industrial Estate  Pune 411 013
2. REDDY  Sreedhar Sanareddy
Tata Research Development and Design Centre  Tata Consultancy Services  54 B Hadapsar Industrial Estate  Pune 411 013

Specification

FORM 2
THE PATENTS ACT, 1970 (39 of 1970) & THE PATENTS RULES, 2003
COMPLETE SPECIFICATION (See section 10, rule 13)
1. Title of the invention: CONTEXT AWARE ONTOLOGY BASED INFORMATION EXTRACTION
2. Applicant(s)
NAME: TATA CONSULTANCY SERVICES LIMITED
NATIONALITY: Indian
ADDRESS: Nirmal Building, 9th Floor, Nariman Point, Mumbai, Maharashtra 400021, India
3. Preamble to the description
COMPLETE SPECIFICATION
The following specification particularly describes the invention and the manner in which it is to be performed.
TECHNICAL FIELD
[0001] The present subject matter relates, in general, to information extraction and, in particular, to a system and a method for context aware ontology based information extraction.
BACKGROUND
[0002] In today's world, an enormous amount of information pertaining to different domains of interest is available on the World Wide Web in a scattered and unstructured manner. Extraction and management of information has always been an active field of research. Information extraction (IE) is the task of extracting useful pieces of information from unstructured or semi-structured documents, such as research papers, blogs, published articles, e-books, and the like.
[0003] Various systems for information extraction have been developed in the past. Recently, ontology based information extraction (OBIE) has emerged as a sub-field of IE where ontologies are used in the IE process. An ontology is defined as a formal and explicit specification of a shared conceptualization. Ontologies play a central role in OBIE by providing a formal means for specifying IE targets and a structure for storing extracted information. An ontology represents a domain in a hierarchical manner and models the domain terminology in terms of concepts, properties, and relations, which can be used to specify IE targets. An OBIE system takes a domain ontology as input and uses various IE techniques to discover instances of domain specific concepts and their property values.
SUMMARY
[0004] This summary is provided to introduce concepts related to context aware ontology based information extraction. These concepts are further described below in the detailed description. This summary is not intended to identify essential features of the claimed subject matter nor is it intended for use in determining or limiting the scope of the claimed subject matter.
[0005] In one embodiment, the method for context aware ontology based information extraction comprises pre-processing an unstructured text document to obtain an induced tree, wherein the induced tree represents words and grammatical relations between the words in the unstructured text document as induced tree nodes. Further, the method comprises creating an object graph based on the induced tree, wherein the object graph comprises a plurality of object graph nodes including entity nodes, property nodes, and relation nodes. Furthermore, the method comprises identifying an ontological type of each of the plurality of the object graph nodes based at least on the entropy scores of the entity nodes, and generating structured information from the unstructured text document upon identifying the ontological type.
BRIEF DESCRIPTION OF THE DRAWINGS
[0006] The detailed description is described with reference to the accompanying figure(s). In the figure(s), the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The same numbers are used throughout the figure(s) to reference like features and components. Some embodiments of systems and/or methods in accordance with embodiments of the present subject matter are now described, by way of example only, and with reference to the accompanying figure(s), in which:
[0007] Figure 1 illustrates a network environment implementing an information extraction system, according to an embodiment of the present subject matter.
[0008] Figure 2 illustrates components of the information extraction system, according to an embodiment of the present subject matter.
[0009] Figure 3 illustrates a method for context aware ontology based information extraction according to an embodiment of the present subject matter.
DETAILED DESCRIPTION
[0010] An ontology is a hierarchical arrangement of a domain that represents the domain terminology in terms of concepts (classes), properties (data type properties), and relations (object type properties). For example, an ontology of a geopolitical entities domain may include concepts such as country, organization, etc. Each country may have various properties or attributes, such as population, currency, area, and the like. Further, the concepts in the ontology may have relations, for example, India borders with Pakistan. Here, "borders with" is the relation between the concepts India and Pakistan.
[0011] Ontology based information extraction (OBIE) involves extracting information pertaining to a particular domain from unstructured documents, identifying entities and their properties in the documents, and relating such entities to concepts in the ontology. The unstructured documents referred to herein may be research papers, blogs, published articles, e-books, and the like. Typically, OBIE systems are broadly classified as ontology learning systems and ontology population systems. The task of an ontology learning OBIE system is to construct domain specific concepts and properties from unstructured texts. An ontology population OBIE system, in contrast, extracts instances of domain specific concepts and their property values for a given domain ontology.
[0012] Conventionally, various approaches to OBIE are available. Such approaches can broadly be classified into machine learning based approaches and rule based approaches. Conventional machine learning based approaches involve assessment of manually tagged data to learn different models for information extraction.
For example, Pandora, a popular internet radio service, employs musicologists to annotate songs with a fixed vocabulary of about five hundred tags. Pandora then creates a personalized music playlist by finding songs that share a large number of tags with a user specified seed song. After about 10 years of effort by about 50 full time musicologists, less than one million songs have been manually annotated, representing less than 5% of the current catalog of iTunes, a music store. Manual tagging of data is thus a tedious and time consuming job. Further, manual tagging is largely dependent on experts for tagging data pertaining to a domain. Moreover, manual tagging lacks domain compatibility, as the manual tagging for one domain may not be applicable to other domains. For each specific domain, the tagging may vary.
[0013] Conventional rule based approaches to OBIE employ manually coded rules to extract entities and relations of interest, pertaining to a given domain, from a given unstructured document. However, most rule based OBIE systems use various domain-specific rules for extracting information from the unstructured documents. Such rules are designed and built to operate on specific domains, and are typically of no value if used for other domains. Moreover, significant time is consumed in coming up with the rules.
[0014] In accordance with the present subject matter, a method and a system for context aware ontology based information extraction (CAOBIE) are described. In an embodiment, a CAOBIE system is configured to extract entities, their properties, and relations from a given unstructured text document, and relate the same to concepts in the domain ontology. For this purpose, the CAOBIE system includes a pattern matcher. The pattern matcher utilizes a plurality of patterns that are written using a pattern language rich in linguistic features for extracting entities, properties, and relations from the unstructured text document.
In one implementation, the patterns referred to herein are domain-independent and can therefore be utilized for extracting entities, properties, and relations corresponding to any domain.
[0015] In one implementation, the pattern matcher is configured to initially extract the properties and relations from the unstructured text document. Further, the pattern matcher is configured to take cues from the extracted properties and relations, upon matching the properties and relations with the patterns, to extract the entities from the unstructured text document.
[0016] Once the entities, properties, and relations are identified, an object graph is created. The object graph represents the identified entities, properties, and relations in the form of entity nodes, property nodes, and relation nodes respectively, collectively called object graph nodes. Subsequent to the creation of the object graph, the CAOBIE system utilizes a global context aware type identification algorithm for identifying an ontological type of the entity nodes. In one implementation, the global context aware type identification algorithm uses global level information, such as information included in related entity nodes, in addition to local contextual information provided by property and relation nodes, for identifying the ontological type of these nodes.
[0017] Use of the global level information helps in making a precise decision when identifying the ontological type of the entity nodes. Further, the concept of entropy is used to determine the uncertainty associated with the ontological type of the entity nodes. The information related to the ontological type is then propagated through the object graph from the entity nodes having a low entropy based score to the entity nodes having a high entropy based score in an iterative manner.
In one implementation, the CAOBIE system determines the ontological type of the property nodes and relation nodes based on computing similarity scores for the property nodes and the relation nodes. The similarity scores are computed, for example, by comparing the property nodes and the relation nodes with corresponding properties and relations in a domain ontology.
[0018] Once the ontological types of the object graph nodes, including the entity nodes, relation nodes, and property nodes, are determined, the object graph nodes along with their ontological type information are serialized to RDF notation. In one implementation, the CAOBIE system is configured to process the unstructured text document, to create the object graph, and subsequently to generate the structured information by converting the object graph to RDF notation. In one implementation, the structured information thus extracted can be stored in a data store that can be queried for retrieving domain specific information stored therein.
[0019] The systems and the methods in accordance with the present subject matter provide efficient ontology based information extraction. The CAOBIE system implements generically written patterns which are applicable across different domains, thereby eliminating the frequent need for a domain expert to write or re-write patterns for different domains. Further, the context aware approach using global level information, presented in the global context aware type identification algorithm, helps in precisely extracting the information from a given unstructured text document.
[0020] The following disclosure describes the system and the method for context aware ontology based information extraction.
While aspects of the described system and method can be implemented in any number of different computing systems, environments, and/or configurations, embodiments for the context aware ontology based information extraction are described in the context of the following exemplary system(s) and method(s).
[0021] Figure 1 illustrates a network environment 100 implementing an information extraction system 102, in accordance with an embodiment of the present subject matter.
[0022] In one implementation, the network environment 100 can be a public network environment, including thousands of personal computers, laptops, various servers, such as blade servers, and other computing devices. In another implementation, the network environment 100 can be a private network environment with a limited number of computing devices, such as personal computers, servers, laptops, and/or communication devices, such as mobile phones and smart phones.
[0023] The information extraction system 102 is communicatively connected to a plurality of user devices 106-1, 106-2, 106-3, ..., and 106-N, collectively referred to as user devices 106 and individually referred to as a user device 106, through a network 108. In one implementation, a plurality of users may use the user devices 106 to communicate with the information extraction system 102. In said implementation, the information extraction system 102 is further connected to a data store 104 through the network 108. Though the data store 104 is shown external to the information extraction system 102, it will be appreciated that the data store 104, in another implementation, can be integrated within the information extraction system 102.
[0024] The information extraction system 102 and the user devices 106 may be implemented in a variety of computing devices, including servers, a desktop personal computer, a notebook or portable computer, a workstation, a mainframe computer, a laptop, and/or a communication device, such as mobile phones and smart phones. Further, in one implementation, the information extraction system 102 may be a distributed or centralized network system in which different computing devices may host one or more of the hardware or software components of the information extraction system 102.
[0025] The information extraction system 102 may be connected to the user devices 106 over the network 108 through one or more communication links. The communication links between the information extraction system 102 and the user devices 106 are enabled through a desired form of communication, for example, via dial-up modem connections, cable links, digital subscriber lines (DSL), wireless or satellite links, or any other suitable form of communication.
[0026] The network 108 may be a wireless network, a wired network, or a combination thereof. The network 108 can also be an individual network or a collection of many such individual networks, interconnected with each other and functioning as a single large network, e.g., the Internet or an intranet. The network 108 can be implemented as one of the different types of networks, such as an intranet, a local area network (LAN), a wide area network (WAN), the internet, and the like. The network 108 may either be a dedicated network or a shared network, which represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), etc., to communicate with each other.
Further, the network 108 may include network devices, such as network switches, hubs, and routers, for providing a link between the information extraction system 102 and the user devices 106. The network devices within the network 108 may interact with the information extraction system 102 and the user devices 106 through the communication links.
[0027] According to an implementation of the present subject matter, the information extraction system 102 may be configured for extracting information pertaining to a particular domain of interest. For this purpose, the information extraction system 102 may include a domain ontology corresponding to the domain of interest, pre-stored in a repository associated with the information extraction system 102. The domain ontology includes concepts related to the domain, their properties, and relations between the concepts. Such a domain ontology acts as a knowledge base for a particular domain.
[0028] Although the information extraction according to the present description is explained with reference to a single domain of interest, it will be apparent to a person skilled in the art that the information extraction can be performed for multiple domains of interest, and multiple domain ontologies can be provided for this purpose.
[0029] In one implementation, the information extraction system 102 may be configured to populate and enrich the domain ontology using annotations. For example, the concepts of the domain ontology can be enriched with description annotations describing the meaning of the concepts. Enrichment of the concepts enables determining the initial probability of an entity having a particular concept type. For each concept in the domain ontology, similarity values (scores) between the words in the context of a given entity and the words in the concept description are evaluated. The similarity values are then normalized to get initial probability values.
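The normalization step of paragraph [0029] can be illustrated with a minimal Python sketch. The specification does not name a similarity measure, so the Jaccard word overlap used here, the uniform fallback when every score is zero, and the function names `word_overlap_similarity` and `initial_probabilities` are all assumptions made for illustration only.

```python
def word_overlap_similarity(context_words, description_words):
    """Jaccard overlap of two word lists (an assumed similarity measure)."""
    a = {w.lower() for w in context_words}
    b = {w.lower() for w in description_words}
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def initial_probabilities(context_words, concept_descriptions):
    """Normalize per-concept similarity scores so that they sum to 1.

    concept_descriptions maps a concept name to the words of its
    description annotation. If every score is zero, a uniform
    distribution is returned (an assumption, not from the specification).
    """
    scores = {
        concept: word_overlap_similarity(context_words, words)
        for concept, words in concept_descriptions.items()
    }
    total = sum(scores.values())
    if total == 0:
        return {concept: 1.0 / len(scores) for concept in scores}
    return {concept: score / total for concept, score in scores.items()}


# Hypothetical description annotations for two concepts.
descriptions = {
    "Country": ["a", "nation", "with", "territory", "and", "population"],
    "Organization": ["a", "group", "of", "people", "with", "a", "purpose"],
}
context = ["India", "is", "a", "nation", "with", "a", "large", "population"]
probs = initial_probabilities(context, descriptions)
```

In this made-up example the context shares more words with the Country description than with the Organization description, so Country receives the larger initial probability.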
[0030] Further, for each ontological concept in the domain ontology, relative identification weights are assigned for its properties and relations. The identification weights indicate the relative importance of a property or relation in identifying the concept. For example, consider an Organization domain with two concepts, say, Employee and Department, and three properties, say, Employee.name, Department.name, and Employee.reports_to. Here, the occurrence of reports_to in text can provide cues that the type of the associated entity is Employee, whereas name cannot, because it is shared by both concepts. Therefore, reports_to is given more identification weight than name.
[0031] In one implementation, the information extraction system 102 may be configured to provide annotations by adding synonyms for the concepts, properties, and relations present in the domain ontology. Further, the information extraction system 102 may be configured for specifying stricter constraints on the values of the properties related to the concepts. For this purpose, the domain ontology is enriched with value pattern annotations that specify a regular expression pattern that the values of the property should match. For example, in a camera review domain with a property Camera.megapixel, the regex for the value pattern may be specified as: \d+(\.\d+)?(mp|megapixel).
[0032] In one implementation, the information extraction system 102 is configured to extract information from an unstructured text document based on the domain ontology. The unstructured text document may be a research paper, a blog post, a news article, and the like, pertaining to the domain. To extract the information, the information extraction system 102 is configured to perform a series of pre-processing steps over the unstructured text document. During the pre-processing of the unstructured text document, linguistic features of the unstructured text document are extracted and a dependency tree is generated.
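As an illustration of the value pattern annotation described in paragraph [0031], the following Python sketch checks candidate property values against a per-property regular expression. It assumes the pattern is `\d+(\.\d+)?(mp|megapixel)`, that matching is case-insensitive and must cover the whole value, and that properties without an annotation accept any value. The names `VALUE_PATTERNS` and `matches_value_pattern` are hypothetical.

```python
import re

# Hypothetical registry of value pattern annotations, keyed by property.
VALUE_PATTERNS = {
    "Camera.megapixel": re.compile(r"\d+(\.\d+)?(mp|megapixel)", re.IGNORECASE),
}


def matches_value_pattern(prop, value):
    """Return True if value satisfies the value pattern annotated on prop.

    Properties without an annotation are treated as unconstrained
    (an assumption made for this sketch).
    """
    pattern = VALUE_PATTERNS.get(prop)
    if pattern is None:
        return True
    return pattern.fullmatch(value.strip()) is not None
```

With this sketch, values such as "12.1mp" or "16megapixel" are accepted for Camera.megapixel, while "12 mp" is rejected because the assumed pattern allows no space before the unit.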
In one implementation, the information extraction system 102 is configured to extract the linguistic features from the unstructured text document based on conventionally known natural language processing techniques. A dependency tree provides a representation of the grammatical relations between the words in a sentence. For example, in a dependency tree, the nodes represent the words and the edges represent the grammatical relations.
[0033] In one implementation, the information extraction system 102 is configured to generate an induced tree based on the dependency tree. In the said implementation, the induced tree represents the words, as well as the grammatical relations between the words, in the form of nodes, hereinafter referred to as induced tree nodes. Further, a set of tree transformations is applied to the induced tree, using a conventionally available tree transformation language, to handle conjunctions in the induced tree.
[0034] In one implementation, the pattern matching module 110 is configured to process the induced tree and extract information pertaining to the domain from the induced tree, based on the domain ontology. The pattern matching module 110 applies a plurality of predefined patterns on the induced tree to generate a graph structure, hereinafter referred to as an object graph. In one implementation, the object graph includes a plurality of object graph nodes representing different entities, properties, and relations. For example, the entities are represented as entity nodes, properties are represented as property nodes, and relationships between entity nodes are represented as relation nodes in the object graph. In an example, the entity node represents an instance of a domain entity found in the unstructured text document, the property node links an entity node with its property value, and the relation node links two entity nodes that represent the domain and range of some ontological object property.
In one implementation, the entity nodes are determined based on identification weights assigned to the properties and relations in the domain ontology.
[0035] In one implementation, the type identification module 112 is configured to identify an ontological type of the object graph nodes. For example, the ontological types for the three types of object graph nodes are: concepts for entity nodes; data properties for property nodes; and object properties for relation nodes. In one implementation, the type identification module 112 may use the standard Hearst patterns, which are concept identification patterns, for identifying the ontological type of the entity nodes in the object graph. In another implementation, the type identification module 112 may be configured to determine the ontological type of the entity nodes based on pre-assigned identification weights assigned to properties and relations in the domain ontology. Once the ontological type of the entity nodes is determined, the type identification module 112 can be configured to determine the uncertainty associated with the identified ontological type of the entity nodes, in one implementation. The type identification module 112, for example, utilizes the conventionally known concept of entropy to determine the uncertainty associated with each of the entity nodes in the object graph, and assigns a correct, or certain, ontological type to such entity nodes. Subsequent to the identification of the certain ontological type, the type identification module 112 may be configured to propagate the information related to the certain ontological type of the entity node through the remaining entity nodes. Once the certain ontological type is identified for the entity nodes, the entity nodes are related or associated with the corresponding concepts in the domain ontology.
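The entropy based propagation outlined in paragraph [0035] can be sketched in Python as follows. This is only one possible reading, not the algorithm of the specification: each entity node is assumed to carry a probability distribution over concepts, Shannon entropy measures its uncertainty, nodes are fixed in order of increasing entropy, and a fixed node boosts the concept expected by the relation signature on its unresolved neighbours. The update rule in `boost`, the factor 2.0, and the names `propagate_types` and `signatures` are assumptions.

```python
import math


def entropy(dist):
    """Shannon entropy, in bits, of a concept probability distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def boost(dist, concept, factor=2.0):
    """Scale one concept's probability and renormalize (assumed update rule)."""
    if concept not in dist:
        return dist
    scaled = {c: p * (factor if c == concept else 1.0) for c, p in dist.items()}
    total = sum(scaled.values())
    return {c: p / total for c, p in scaled.items()}


def propagate_types(dists, edges, signatures):
    """Fix entity types from the most certain node to the least certain one.

    dists:      entity node -> {concept: probability}
    edges:      list of (source, relation, target) relation nodes
    signatures: relation -> (domain concept, range concept) from the ontology
    """
    dists = {node: dict(d) for node, d in dists.items()}
    resolved = set()
    while len(resolved) < len(dists):
        # Pick the unresolved node whose type is least uncertain.
        node = min((n for n in dists if n not in resolved),
                   key=lambda n: entropy(dists[n]))
        resolved.add(node)
        best = max(dists[node], key=dists[node].get)
        # Push the decision to unresolved neighbours along relation nodes.
        for src, rel, tgt in edges:
            domain, rng = signatures[rel]
            if src == node and best == domain and tgt not in resolved:
                dists[tgt] = boost(dists[tgt], rng)
            elif tgt == node and best == rng and src not in resolved:
                dists[src] = boost(dists[src], domain)
    return {node: max(d, key=d.get) for node, d in dists.items()}


# Hypothetical initial distributions for two entity nodes.
example_dists = {
    "Ratan Tata": {"Person": 0.9, "Company": 0.1},
    "Tata Nano": {"Company": 0.5, "Product": 0.5},
}
example_edges = [("Ratan Tata", "launched", "Tata Nano")]
example_signatures = {"launched": ("Person", "Product")}
types = propagate_types(example_dists, example_edges, example_signatures)
```

In this made-up example the node for Ratan Tata has the lower entropy and is fixed first as Person. Because the assumed signature of launched is (Person, Product), the ambiguous Tata Nano node is then pushed towards Product.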
[0036] In one implementation, the information extraction system 102 may serialize the object graph to RDF notation and maintain the RDF notation in the data store 104, which can be queried to retrieve the information pertaining to the domain stored therein.
[0037] Figure 2 illustrates components of an information extraction system 102 according to an embodiment of the present subject matter.
[0038] In one implementation, the information extraction system 102 includes one or more processor(s) 202, I/O interfaces 204, and a memory 206 coupled to the processor 202. The processor 202 can be a single processing unit or a number of units, all of which could include multiple computing units. The processor 202 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the processor 202 is configured to fetch and execute computer-readable instructions and data stored in the memory 206.
[0039] The I/O interfaces 204 may include a variety of software and hardware interfaces, for example, interfaces for peripheral device(s), such as a keyboard, a mouse, a display unit, an external memory, and a printer. Further, the I/O interfaces 204 may enable the information extraction system 102 to communicate with other devices, such as web servers and external databases. The I/O interfaces 204 can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, local area network (LAN), cable, etc., and wireless networks, such as Wireless LAN (WLAN), cellular, or satellite. For this purpose, the I/O interfaces 204 may include one or more ports for connecting a number of computing systems with one another or to a network.
[0040] The memory 206 may include any non-transitory computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. In one implementation, the information extraction system 102 also includes module(s) 208 and data 210.
[0041] The module(s) 208, amongst other things, include routines, programs, objects, components, data structures, etc., which perform particular tasks or implement data types. The module(s) 208 may also be implemented as signal processor(s), state machine(s), logic circuitries, and/or any other device or component that manipulates signals based on operational instructions.
[0042] Further, the module(s) 208 can be implemented in hardware, as instructions executed by a processing unit, or by a combination thereof. The processing unit can comprise a computer, a processor, such as the processor 202, a state machine, a logic array, or any other suitable device capable of processing instructions. The processing unit can be a general-purpose processor which executes instructions to cause the general-purpose processor to perform the required tasks, or the processing unit can be dedicated to performing the required functions.
[0043] In another aspect of the present subject matter, the module(s) 208 may be machine-readable instructions (software) which, when executed by a processor/processing unit, perform any of the described functionalities. The machine-readable instructions may be stored on an electronic memory device, hard disk, optical disk, or other machine-readable storage medium or non-transitory medium. In one implementation, the machine-readable instructions can also be downloaded to the storage medium via a network connection.
[0044] In one implementation, the module(s) 208 further include a pre-processing module 212, a pattern matching module 110, a type identification module 112, a structuring module 214, and other module(s) 216. The other modules 216 may include programs or coded instructions that supplement applications and functions of the information extraction system 102.
[0045] The data 210 serves, amongst other things, as a repository for storing data processed, received, and generated by one or more of the modules 208. The data 210 includes pre-processing data 218, pattern matching data 220, type identification data 222, structured data 224, and other data 226. The other data 226 includes data generated as a result of the execution of one or more modules in the modules 208.
[0046] In one implementation, the information extraction system 102 may be configured to extract information, pertaining to a particular domain of interest, from an unstructured text document. For extracting the information, the pre-processing module 212 may be configured to process the enrichments, such as annotations and descriptions, captured in the domain ontology as described previously. Enrichment of the domain ontology helps in identification and classification of entities and their property values present in the unstructured text document. The pre-processing module 212 performs a series of pre-processing steps, such as identifying linguistic features from the unstructured text document, generating an induced tree, and applying tree transformations. In one implementation, the pre-processing module 212 is configured to identify linguistic features from the unstructured text document using a natural language processing technique. Further, for generating the induced tree, the pre-processing module 212 utilizes conventionally known dependencies, for example, Stanford dependencies (SD). The SD includes a total of 53 grammatical relations.
The grammatical relations are used to locate the entities once the property and relation occurrences are found. The SD provides a representation of the grammatical relations between the words in a sentence of the unstructured text document. The words of the sentence along with the grammatical relations form a dependency tree, where nodes of the dependency tree represent the words and edges represent the grammatical relations.

[0047] Subsequently, the pre-processing module 212 is configured to generate the induced tree from the dependency tree. The induced tree thus generated represents the words and the grammatical relations between the words in the sentence as induced tree nodes. In one implementation, the pre-processing module 212 applies a set of tree transformations to the induced tree to refine the induced tree. In an example, the pre-processing module 212 may store the induced tree in the pre-processing data 218. In an example, the pre-processing module 212 applies a conventionally known tree transformation language, such as Stanford TSurgeon, for executing these tree transformation patterns. An exemplary illustration of the tree transformation patterns is provided in table 1 below.

Table 1

Tree transformation: ConjunctionAnd
TRegex pattern (condition): /.*/=head < (cc=vCC < and=vAnd) < (conj=vConj < /.*/=brother)
TSurgeon operations: move brother $- head; delete vConj
Remarks: All the conjuncts in an "and" conjunction become siblings, that is, children of the parent of the head conjunct (e.g. India borders with Pakistan and China).

Tree transformation: CompoundNoun
TRegex pattern (condition): /.*/=head < (nn=vNN < /.*/=compound)
TSurgeon operations: accumulate compound head compound; excise vNN compound
Remarks: Words in a compound noun are considered as a single unit, e.g. India borders with Sri Lanka; Sri Lanka is stored as a single induced tree node.

Tree transformation: ModifierList
TRegex pattern (condition): /.*/=head < (/.*mod.*/=vMod < /.*/=modifier)
TSurgeon operations: accumulate modifier head modifier; excise vMod modifier
Remarks: All modifiers are stored along with the induced tree node of the word that they modify.

Tree transformation: CompoundNumber
TRegex pattern (condition): /.*/=head < (number=vNumber < /.*/=compound)
TSurgeon operations: prune vNumber
Remarks: All the words in a compound number are treated as a single node, e.g. I lost $ 3.2 billion. Here, $ 3.2 billion is treated as a single node of number type.

[0048] Once the induced tree is generated, the pattern matching module 110 is configured to extract information, such as entities, properties, and relations, from the induced tree based on matching the words and relations in the induced tree with a plurality of predefined patterns. In one implementation, the pattern matching module 110 is configured to initially extract properties and relations from the induced tree. Based on the properties and relations, the pattern matching module 110 extracts entities from the induced tree.

[0049] The patterns referred to herein are written in a pattern language. A pattern consists of a premise and a sequence of actions. The premise is a set of conditions that should hold true for the actions to be executed. In one implementation, the premise consists of tree paths, ontological constraints, and Boolean expressions. The action component in the pattern specifies a sequence of actions to be performed over variable bindings from the premise. Further, the basic constituent used in the action is assignment. The pattern language makes use of language constructs to explicitly refer to various ontological elements such as concepts, properties, and relations. An exemplary subset of the grammar of the pattern language is provided below.

    patterns :- pattern*
    pattern :- patternID "{" premise "}" "->" "{" actions "}"
    patternID :- (DIGIT)+
    premise :- (treePath ";")+ (ontologyConstraint ";")+ ("{" boolean_expression "}" ";")?
    treePath :- element | element "--" treePath
    ontologyConstraint :- ontologyElement = variable
    actions :- ("{" action+ "}")+
    action :- LHS = RHS ";"
    LHS :- ontologyActionElement | variable
    RHS :- variable | identifier | action_function

[0050] The patterns written using the pattern language help in identification of properties and relations, and thereby help in identifying the entities in the unstructured text document. In one implementation, the tree paths and the ontology constraints are defined in the pattern language in such a manner that the patterns are generic and are applicable across different domains. An example of such generic, domain independent patterns is provided in table 2 below.

Table 2

Pattern 1: Property extraction
Example sentence: India has a coastline of 7515 km.
Pattern: 1 { -- dobj -- -- prep -- of -- pobj -- ; property = ; -- nsubj -- ; {isRoot() && isTypeMatching(, Number)}; } -> {source=; target=; property= }

Pattern 2: Relation extraction
Example sentence: Ratan Tata launched Tata Nano in 2010.
Pattern: 2 { -- nsubj -- ; -- dobj -- ; relation = ; {isRoot()};} -> {source = ; target = ; relation = ; }

Pattern 3: Concept identification
Example sentence: India is a country in South Asia.
Pattern: 3 { -- nsubj -- ; -- cop; class = ;} -> {class = ; entity = ; }

[0051] The first two patterns shown in table 2 are based on property extraction and relation extraction, respectively. The last pattern shows one of the class identification patterns written using the pattern language.

[0052] In one implementation, the pattern matching module 110 generates a graph-like structure, hereinafter referred to as the object graph, based on the patterns applied on the induced tree. The object graph includes a plurality of object graph nodes representing entities, properties, and relations. For example, an entity node represents an instance of a domain entity found in the unstructured text document, a property node links an entity node with its property values, and a relation node links two entity nodes that represent the domain and range of some ontological object property.
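The three kinds of object graph nodes described above can be sketched as a small data structure. This is only an illustrative sketch in Python, not the specification's implementation: the class names `EntityNode`, `PropertyNode`, `RelationNode`, and `ObjectGraph`, their fields, and the helper methods are all assumptions introduced here. The sample bindings reuse the example sentences from table 2.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EntityNode:
    """An instance of a domain entity found in the text."""
    text: str
    ontological_type: Optional[str] = None  # unknown until type identification


@dataclass
class PropertyNode:
    """Links an entity node with one of its property values."""
    name: str
    source: EntityNode
    value: str
    ontological_type: Optional[str] = None


@dataclass
class RelationNode:
    """Links two entity nodes (domain and range of an object property)."""
    name: str
    source: EntityNode
    target: EntityNode
    ontological_type: Optional[str] = None


@dataclass
class ObjectGraph:
    entities: Dict[str, EntityNode] = field(default_factory=dict)
    properties: List[PropertyNode] = field(default_factory=list)
    relations: List[RelationNode] = field(default_factory=list)

    def entity(self, text: str) -> EntityNode:
        # Reuse the node when the same surface text was already seen.
        if text not in self.entities:
            self.entities[text] = EntityNode(text)
        return self.entities[text]

    def add_property(self, source: str, name: str, value: str) -> PropertyNode:
        node = PropertyNode(name, self.entity(source), value)
        self.properties.append(node)
        return node

    def add_relation(self, source: str, name: str, target: str) -> RelationNode:
        node = RelationNode(name, self.entity(source), self.entity(target))
        self.relations.append(node)
        return node


graph = ObjectGraph()
graph.add_property("India", "coastline", "7515 km")        # property extraction
graph.add_relation("Ratan Tata", "launched", "Tata Nano")  # relation extraction
print(sorted(graph.entities))  # ['India', 'Ratan Tata', 'Tata Nano']
```

Keeping `ontological_type` empty at construction time mirrors the two-stage flow in the text: the graph is built first from pattern matches, and the ontological type of each node is resolved afterwards.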
In an example, the object graph thus generated can be stored within the pattern matching data 220 by the pattern matching module 110.

[0053] Subsequent to the generation of the object graph, the type identification module 112 is configured to determine an ontological type of each of the object graph nodes with respect to the domain ontology. As indicated previously, the object graph nodes include entity nodes, property nodes, and relation nodes. The ontological type of the object graph nodes indicates concepts in the domain ontology for the entity nodes, data properties for the property nodes, and object properties for the relation nodes. In one implementation, the type identification module 112 determines the ontological type of the property nodes and relation nodes based on computing similarity scores for the property nodes and relation nodes of the object graph. The similarity scores can be computed based on comparing the property nodes and the relation nodes with corresponding properties and relations in the domain ontology. In an implementation, the similarity scores for the property nodes and relation nodes are computed using the equation provided below:

\[ \mathrm{type}(A) = \operatorname*{arg\,max}_{i} \; \mathrm{similarity}(P_i.\mathrm{words},\; A.\mathrm{words}) \]

where P_i = the i-th data property in the ontology; P_i.words = words in the data property P_i (including its synonyms); A.words = words occurring in the property node A. argmax is an argument function having a lower limit as 1
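The argmax selection in the equation above can be illustrated with a short Python sketch. The specification does not state which similarity measure is used, so the Jaccard word overlap below is an assumed stand-in; the function names `similarity` and `identify_type` and the sample data property names and synonyms are likewise hypothetical.

```python
from typing import Dict, Iterable, List


def similarity(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard overlap of two word collections (an assumed stand-in measure)."""
    words_a = {word.lower() for word in a}
    words_b = {word.lower() for word in b}
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def identify_type(node_words: List[str], data_properties: Dict[str, List[str]]) -> str:
    """Return the data property whose words are most similar to the node's words."""
    return max(
        data_properties,
        key=lambda name: similarity(data_properties[name], node_words),
    )


# Hypothetical data properties, each listed with its words and synonyms.
DATA_PROPERTIES = {
    "hasCoastlineLength": ["coastline", "coast", "shoreline", "length"],
    "hasPopulation": ["population", "inhabitants"],
    "hasArea": ["area", "size"],
}

print(identify_type(["coastline"], DATA_PROPERTIES))  # hasCoastlineLength
```

Note that `max` returns the first candidate when every score is zero, so a practical version would probably reject matches that fall below some minimum score rather than always returning a type.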

Documents

Orders

Section Controller Decision Date

Application Documents

# Name Date
1 3537-MUM-2012-FORM 18(17-12-2012).pdf 2012-12-17
2 3537-MUM-2012-CORRESPONDENCE(17-12-2012).pdf 2012-12-17
3 3537-MUM-2012-FORM 1(18-12-2012).pdf 2012-12-18
4 3537-MUM-2012-CORRESPONDENCE(18-12-2012).pdf 2012-12-18
5 ABSTRACT1.jpg 2018-08-11
6 3537-MUM-2012-FORM 26(1-2-2013).pdf 2018-08-11
7 3537-MUM-2012-CORRESPONDENCE(1-2-2013).pdf 2018-08-11
8 3537-MUM-2012-FORM 5.pdf 2018-10-03
9 3537-MUM-2012-FORM 3.pdf 2018-10-03
10 3537-MUM-2012-FORM 2.pdf 2018-10-03
11 3537-MUM-2012-FER.pdf 2018-10-08
12 3537-MUM-2012-OTHERS [02-04-2019(online)].pdf 2019-04-02
13 3537-MUM-2012-FER_SER_REPLY [02-04-2019(online)].pdf 2019-04-02
14 3537-MUM-2012-DRAWING [02-04-2019(online)].pdf 2019-04-02
15 3537-MUM-2012-COMPLETE SPECIFICATION [02-04-2019(online)].pdf 2019-04-02
16 3537-MUM-2012-CLAIMS [02-04-2019(online)].pdf 2019-04-02
17 3537-MUM-2012-ABSTRACT [02-04-2019(online)].pdf 2019-04-02
18 3537-MUM-2012-HearingNoticeLetter-(DateOfHearing-11-11-2019).pdf 2019-10-10
19 3537-MUM-2012-Correspondence to notify the Controller (Mandatory) [18-10-2019(online)].pdf 2019-10-18
20 3537-MUM-2012-Written submissions and relevant documents (MANDATORY) [22-11-2019(online)].pdf 2019-11-22
21 3537-MUM-2012-PatentCertificate05-08-2020.pdf 2020-08-05
22 3537-MUM-2012-IntimationOfGrant05-08-2020.pdf 2020-08-05
23 3537-MUM-2012-RELEVANT DOCUMENTS [27-09-2022(online)].pdf 2022-09-27
24 3537-MUM-2012-RELEVANT DOCUMENTS [26-09-2023(online)].pdf 2023-09-26

Search Strategy

1 searchreport3537MUM2012_05-10-2018.pdf

ERegister / Renewals

3rd: 07 Aug 2020

From 14/12/2014 - To 14/12/2015

4th: 07 Aug 2020

From 14/12/2015 - To 14/12/2016

5th: 07 Aug 2020

From 14/12/2016 - To 14/12/2017

6th: 07 Aug 2020

From 14/12/2017 - To 14/12/2018

7th: 07 Aug 2020

From 14/12/2018 - To 14/12/2019

8th: 07 Aug 2020

From 14/12/2019 - To 14/12/2020

9th: 07 Aug 2020

From 14/12/2020 - To 14/12/2021

10th: 15 Nov 2021

From 14/12/2021 - To 14/12/2022

11th: 07 Dec 2022

From 14/12/2022 - To 14/12/2023

12th: 08 Dec 2023

From 14/12/2023 - To 14/12/2024

13th: 12 Dec 2024

From 14/12/2024 - To 14/12/2025