Automated Molecular Mining And Activity Prediction Using XML Schema, XML Queries, Rule Inference And Rule Engines

Abstract: A computer system including hardware and software elements, processing methods and steps provide for automated molecular mining, transformation of structural information for chemically, biologically or pharmacologically related molecules to a hierarchical schema of concepts and descriptors, detecting tree-like patterns in related schema and predicting biological activity using rules inferred from analyzing the patterns. Using XML Schema, XML Queries, Rule Inference and Rule Engines, patterns common to all molecules in a given class or clusters of molecules in the class can be extracted and stored, forming rules that relate structure and activity. Such patterns can be stored as rules for matching with query molecules, thus indicating potential uses of the query molecules. Chemical concepts and descriptors can include functional groups, ring systems, and atom and bond types, inter alia, and the distances between these entities can be defined in an XML schema, DTD or simple XML file. An XML template file is used to transform a class of molecules with structural data to an XML file, reflecting the tree-like structure of the template. The algorithm can find rules for continuous, binary, one-class and multi-class data.

Patent Information

Application #:
Filing Date: 20 June 2008
Publication Number: 39/2008
Publication Type: INA
Invention Field: COMPUTER SCIENCE
Status:
Parent Application:

Applicants

SYSTEMS BIOLOGY INDIA PVT. LTD., 401 BETA 1, GIGASPACE, VIMAN NAGAR, PUNE 411014

Inventors

1. RAJEEV GANGAL, SYSTEMS BIOLOGY INDIA PVT. LTD, 401 BETA 1, GIGASPACE, VIMAN NAGAR, PUNE 411014

Specification

FORM - 2
THE PATENTS ACT, 1970 (39 of 1970)
COMPLETE SPECIFICATION (Section 10, rule 13)

"Automated Molecular Mining and Activity Prediction using XML Schema, XML Queries, Rule Inference and Rule Engines"
Systems Biology India Pvt. Ltd.,
an Indian Company registered under the provisions of the Companies Act, 1956,
with our Corporate office at 401 Beta 1, Gigaspace, Viman Nagar, Pune 411014,
Maharashtra, India.

FIELD OF THE INVENTION
This invention pertains to the interdisciplinary field of chemo-informatics and chemical structure-activity relationships (SAR) and more particularly to automating transformation of structural information for chemically, biologically or pharmacologically related molecules to a hierarchical schema of concepts and descriptors, discovering patterns in related schema and predicting biological activity using rules inferred from analyzing the patterns.
BACKGROUND
Informatics is increasingly driving scientific discovery. Bioinformatics and chemoinformatics are interdisciplinary informatics techniques that facilitate 'in-silico' experimentation in biology and chemistry respectively. These disciplines implement data mining algorithms to mine molecular data, macromolecular data and small molecules, respectively. Most algorithms originate from computer science and are applied to deciphering the function of proteins, DNA and small molecules. For example, graph-theoretical methods are used for calculating descriptors for organic molecules. Increasingly, bioinformatics and chemo-informatics algorithms are being used together in disciplines such as chemical biology.
Historically, biological data such as protein and DNA sequences, structures, microarray and proteomics data have been freely available, owing to open policies of worldwide biomedical institutions, such as the NCBI and/or the EBI. Chemical data has been generally proprietary and could be accessed as a paid service or product.
The advent of open databases such as PubChem (http://pubchem.ncbi.nlm.nih.gov/) has changed the dynamics of data access, so much so that many chemical suppliers are freely and increasingly submitting their data into PubChem. Some of these chemical data are linked to pharmacological and/or biological classes using the MeSH schema (the U.S. National Library of Medicine's controlled vocabulary used for indexing articles for MEDLINE/PubMed). There are several other databases that also link toxicological and other biological information with chemical structure. The information might be quantitative, e.g., minimum inhibitory concentration (MIC) values, or qualitative, e.g., "the molecule is hepatotoxic" or "the molecule is antiinfective."
Where the information is qualitative, care has been taken by the curators to follow a standard definition or threshold for determining when a molecule should be called active or toxic. The number of molecules in the PubChem database now exceeds 18 million. This enormous amount of chemical and biological data, while useful, raises an important data mining challenge of relating biological activities, e.g., toxicity, mechanisms of action, pharmacology, and adverse effects, to the structure of molecules. MeSH defines a hierarchy of biological, pharmacological concepts and is linked to some PubChem records. It is desirable to find all molecules linked to the different levels in MeSH and to mine chemical patterns that are common to them. Such common patterns are referred to as pharmacophores, biophores or toxicophores, depending on the activity under consideration.
A superimposition or alignment of 2D and/or 3D structures indicates geometrically conserved patterns. These are alignment-dependent pharmacophores, biophores or toxicophores, as the case might be. The limitation of this approach is that 2D graphs or 3D conformations are required. As the molecules diverge in structure, the likelihood of obtaining good alignments decreases. Another approach is to find maximum common substructures present in a given class of molecules. Graph-theoretic (Wiener index), topological (rings, atom counts) and physico-chemical properties such as molecular weight, polar surface area, and/or logP are also used. These descriptors are then related with classes of molecules with common activity. The problem common to most of these methods is that using a table to store descriptors loses the hierarchical relationships between the descriptors. Presence or absence of functional groups, atom types and rings is also used as a so-called "fingerprint" and some measure of distance between fingerprints of molecules is used to assess similarities. The similarities are then used for clustering and for inferring commonality of activity.
Thus, there is clearly a basic limitation to the above approaches. Chemists generalize molecules in terms of ring systems, functional groups and atom and bond types. All these concepts, especially functional groups are hierarchical in nature. A fragment common to all molecules might be aliphatic, alkane, etc. Most of the molecules might have a primary alkane fragment, while some others might have a secondary or tertiary alkane. However, conceptually the fragments are similar since they are all alkanes, only differing in specific types. This similarity is missed by fragment-count algorithms that rely on graph-matching techniques. Similarity search algorithms predefine a library of substructures of functional groups, ring systems and atoms and bonds. However, the 'similarity' between two molecules is quantified in terms of a mathematically defined distance between vectors of numbers representing them, which again does not delve into the hierarchical nature of domain knowledge. The issue is compounded when considering two connected substructures. While it is desirable to specify the exact molecular graph of the two molecular fragments, the likelihood that this connectivity will be conserved over many molecules in a class is very small. It is far more likely that the connection pattern, e.g., amine, primary amine connected
to a carbonyl group, carboxylic acid, will be conserved. Thus, the hierarchical nature of the domain representation can help in identifying extremely specific, as well as generic patterns at a higher level of abstraction.
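The argument above, that exact fragment matching misses similarity that a concept hierarchy retains, can be illustrated with a minimal Python sketch. The concept paths and the helper name `common_concept` are illustrative assumptions and are not taken from the specification's ontology.

```python
# Hypothetical root-to-leaf concept paths; the names are illustrative only.
def common_concept(path_a, path_b):
    """Return the longest shared prefix of two root-to-leaf concept paths."""
    shared = []
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        shared.append(a)
    return shared

mol_1 = ["Aliphatic", "Alkane", "Primary Alkane"]
mol_2 = ["Aliphatic", "Alkane", "Secondary Alkane"]

# An exact comparison of the leaf fragments sees two different fragments...
exact_match = mol_1[-1] == mol_2[-1]
# ...while the hierarchy still records that both fragments are alkanes.
shared = common_concept(mol_1, mol_2)
print(exact_match, shared)
```

A flat descriptor table would store only the leaf names, so the shared parent concept would have to be recomputed or would be lost.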
While there have been some attempts to provide the facility of querying structure databases based on functional group and ring system hierarchies, the explicit intention of using optimal common hierarchical patterns to understand biological activity at a wide variety of levels has not been attempted. It is desirable, then, and an object of the invention, to provide improved approaches for automated data mining in the context of finding common, hierarchical patterns. Some previous automated methods for discovering and/or analyzing structure activity relationships have used manually curated rule bases and expert systems, but have been dependent on specialized logic languages for inference. Manually curated rule bases have been in widespread use for several decades now, underscoring the simplicity and effectiveness of knowledge bases. One example is DEREK for Windows, which has chemical alerts for hepatotoxicity, bacterial mutagenicity, genotoxicity and skin sensitization. In order to create a more efficient and accessible solution, however, there is a need for an approach for automatically generating a robust rule base in a method and system that can be implemented without dependence on specialized logic languages.
There is a need, then, for an improved system that can automate the process of rule discovery for a comprehensive class of activities and its subsequent storage and application to new molecules in the form of an expert system.
SUMMARY
The invention generally provides for transforming two dimensional structural coordinates of a set of chemically, biologically or pharmacologically related molecules to a hierarchical schema of concepts and descriptors. Further, according to the invention, patterns common to all molecules in a given class or clusters of molecules in the class can be extracted and stored, forming rules that relate hierarchical chemical features and concepts to biological, pharmacological or chemical activity. Such patterns can be stored as rules for matching with query molecules, thus indicating potential uses of the query molecules.
The invention further provides for a system and methods that can relate chemical structure to biological and pharmacological activities by transforming molecular structures to a hierarchical representation of chemical concepts and descriptors and detecting common tree-like patterns.
Embodiments of the invention further provide for chemical concepts and descriptors such as functional groups, ring systems, atom and bond types and the distances between these entities to be defined in an XML schema, DTD or simple XML file. Sets of molecules belonging to a common pharmacological or biological activity can be referred to as a class or activity class. An XML template file can be used to transform a class of molecules with structural data to an XML file, reflecting the tree-like structure of the template.
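As a rough illustration of the template-driven transformation described above, the following Python sketch copies only those branches of a concept template that lead to a concept detected in a molecule. The template contents, the element names and the `found` set are hypothetical; the specification does not publish its schema, and the substructure perception step that would produce `found` is assumed to happen elsewhere.

```python
import xml.etree.ElementTree as ET

# Hypothetical template: a nested dict of concept names (not the patent's schema).
TEMPLATE = {
    "FunctionalGroups": {
        "Alkane": {"PrimaryAlkane": {}, "SecondaryAlkane": {}},
        "Carbonyl": {"CarboxylicAcid": {}},
    },
    "RingSystems": {"AromaticRing": {}},
}

def build(parent, template, found):
    """Copy template nodes into the output tree, keeping only branches that
    lead to a concept detected in the molecule. Returns True if any node was kept."""
    kept = False
    for name, children in template.items():
        node = ET.Element(name)
        has_child = build(node, children, found)
        if name in found or has_child:
            parent.append(node)
            kept = True
    return kept

def molecule_to_xml(mol_id, found):
    """Build one <Molecule> record that mirrors the tree shape of the template."""
    root = ET.Element("Molecule", id=mol_id)
    build(root, TEMPLATE, found)
    return root

# `found` stands in for the output of a substructure perception step.
record = molecule_to_xml("mol-1", {"PrimaryAlkane", "CarboxylicAcid"})
print(ET.tostring(record, encoding="unicode"))
```

Because every record shares the template's shape, records for a whole class can later be compared node by node.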
Preferred embodiments of the invention provide for a query performed on the output XML for a given class to give hierarchical patterns that are common to groups of molecules in the class. These common patterns can form rule sets for the given chemical, biological or pharmacological classification. The patterns can be common to a subset of molecules within a class and can form a sub-cluster of rules. Patterns can also be common at the leaf node of the
concept hierarchy or at any previous node. In a preferred embodiment, patterns common to more molecules and reaching terminal nodes are deemed of a higher importance as compared to rules derived from fewer molecules. Similarly, patterns conserved up to the terminal nodes are more specific in nature, e.g., Primary Alkane, as compared to patterns ending at nodes near the root, e.g., Alkane, and are thus more valuable in terms of specificity of the rule (refer to the ontologies).
One preferred embodiment provides for an algorithm that can find rules for binary data. A further preferred embodiment provides for an algorithm that can find rules for continuous, binary, one-class and multi-class data. The invention provides further for rules that are generated to be stored in a file system in XML and/or other formats, LDAP directory, relational database and/or a business rules engine, inter alia. According to at least one preferred embodiment, any such collection of rules can be referred to as a RuleBase, irrespective of the method of rule storage. Further, the invention provides for inferring rules or patterns that are common to or distinct within any number of different biological classes and subclasses. Internal proprietary databases or public domain databases can form the chemical molecule structure and activity data input.
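The ranking idea stated above (patterns shared by more molecules, and patterns reaching deeper nodes, are more important) might be sketched as follows. The way the two criteria are combined here, support first and depth as a tie-breaker, is an assumption made for illustration; the specification does not fix a scoring formula.

```python
from collections import Counter

def prefixes(path):
    """All root-anchored prefixes of a concept path, as tuples."""
    return [tuple(path[:i]) for i in range(1, len(path) + 1)]

def rank_patterns(molecules):
    """Count how many molecules contain each pattern, then rank by
    (support, depth) so widely shared, specific patterns come first."""
    support = Counter()
    for paths in molecules:
        seen = set()
        for path in paths:
            seen.update(prefixes(path))
        support.update(seen)  # each molecule counts once per pattern
    return sorted(support.items(), key=lambda kv: (kv[1], len(kv[0])), reverse=True)

# Hypothetical class of three molecules, each a list of concept paths.
molecules = [
    [["Alkane", "PrimaryAlkane"], ["Carbonyl"]],
    [["Alkane", "PrimaryAlkane"]],
    [["Alkane", "SecondaryAlkane"]],
]
ranked = rank_patterns(molecules)
for pattern, count in ranked:
    print(count, "/".join(pattern))
```

In this toy class the generic Alkane node is conserved in all three molecules, while the more specific Primary Alkane pattern is conserved in two of them.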
According to embodiments of the invention, by using the foregoing system and methods, a user can discover all potential classes of activities or confirm an existing hypothesis about a particular activity or class. A preferred embodiment provides for constructing an integrated knowledge base of rules using all biological and functional classes, as defined in the NCBI MeSH browser (http://www.nlm.nih.gov/mesh/) and using all pharmacological categories, as defined in PubChem (http://pubchem.ncbi.nlm.nih.gov/).
One embodiment of the invention provides for a method for discovering tree-like patterns common to a class of molecules, hereafter called "Rules", by using molecular
functional group, ring system and atom type concept hierarchies or ontologies. A 'class' refers to a set of molecules with common pharmacological, biological or chemical properties. The embodiment further provides for storage, execution and combination of Rules in groups related by virtue of a common class, in file systems (e.g., XML), Rule Engines, LDAP directories and relational databases.
An embodiment further provides for employing the foregoing when the activity classes are arranged in a hierarchy or schema. One embodiment of the invention provides for a method for clustering molecules on the basis of similarity between molecules as a function of the similarity between similar hierarchical patterns.
One embodiment of the invention provides for employing the above methods to find conserved hierarchical conceptual patterns in clusters of similar molecules rather than all molecules in a given class. Each cluster can lead to different sets of rules. A further embodiment provides for employing the foregoing methods where the class or the hierarchical concepts or descriptors have discrete and continuous values. Continuous values are discretized by binning into class intervals. The descriptors used (e.g., spectroscopic data), corresponding to different functional groups, rings and atom types, are arranged in a hierarchical order.
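Discretizing a continuous activity value into class intervals can be sketched in a few lines of Python. The bin edges, the labels and the choice of half-open intervals are assumptions made for illustration; the specification does not state any cut-offs.

```python
import bisect

# Hypothetical bin edges for a continuous activity value (e.g. an MIC);
# the cut-offs and labels are assumptions, not values from the specification.
EDGES = [1.0, 10.0, 100.0]
LABELS = ["high", "moderate", "low", "inactive"]

def discretize(value, edges=EDGES, labels=LABELS):
    """Map a continuous value to the label of its class interval.
    Intervals are (-inf, 1), [1, 10), [10, 100), [100, inf)."""
    return labels[bisect.bisect_right(edges, value)]

classes = [discretize(v) for v in (0.25, 1.0, 50.0, 400.0)]
print(classes)
```

Once binned, a continuous endpoint can be treated like any other multi-class label during rule induction.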
Another embodiment provides for employing the foregoing methods, where the rule includes any equation between discretized class values and rule nodes and where the parameters of the equation are used for rule induction.
One embodiment of the invention provides for using a particular instance of the output of the above methods or the complete rulebase of the foregoing system and methods according to the invention for inferring all potential activities or confirming a particular
activity by forward and backward chaining in a rule engine, or performing Boolean queries on a relational database or similar schema.
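Forward chaining over a rule base can be illustrated with a small, self-contained Python sketch. This is not a real rule engine and does not show backward chaining; the rule format (a tuple of required concepts and one conclusion) and all concept and class names are hypothetical.

```python
def forward_chain(facts, rules):
    """Repeatedly fire rules whose conditions are all known facts until no
    new conclusion is produced. `rules` is a list of (conditions, conclusion)."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in known and set(conditions) <= known:
                known.add(conclusion)
                changed = True
    return known - set(facts)

# Hypothetical rules; concept and class names are illustrative only.
RULES = [
    (("PrimaryAmine", "CarboxylicAcid", "AromaticRing"), "ClassA"),
    (("ClassA",), "ParentClassOfA"),  # activity classes can themselves be hierarchical
    (("Nitro", "AromaticRing"), "ClassB"),
]

inferred = forward_chain({"PrimaryAmine", "CarboxylicAcid", "AromaticRing"}, RULES)
print(sorted(inferred))
```

Starting from the concepts found in a query molecule, the loop derives every activity class whose rule is satisfied, including classes implied by other classes.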
A further embodiment provides for finding similarities between connectivities of functional groups, ring systems and atom types conserved in all or clusters of molecules.
An embodiment of the invention provides for finding bioisosteres by enumerating differences between functional groups, rings and atom types in the molecules, in a given class.
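Enumerating differences between two molecules of the same class reduces, in its simplest form, to set differences over their concept sets. The sketch below assumes each molecule has already been reduced to a set of concept names; the feature sets are invented for illustration, although carboxylic acid and tetrazole are a commonly cited bioisosteric pair.

```python
def candidate_replacements(features_a, features_b):
    """Return (shared, only_a, only_b) for two molecules of the same class.
    Concepts unique to each molecule are candidate bioisosteric replacements."""
    a, b = set(features_a), set(features_b)
    return a & b, a - b, b - a

# Hypothetical concept sets for two molecules in one activity class.
shared, only_a, only_b = candidate_replacements(
    {"AromaticRing", "Amide", "CarboxylicAcid"},
    {"AromaticRing", "Amide", "Tetrazole"},
)
print(sorted(shared), sorted(only_a), sorted(only_b))
```

Whether a difference found this way is a genuine bioisostere would still need to be judged against activity data for the class.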
An embodiment provides for generating all chemically feasible molecular structures from molecular formulae of known drugs and drug-like molecules and using a Rule Base obtained from the foregoing methods and system to infer activities.
One embodiment provides for predicting biological activity at a higher biological level, i.e., activity against cell, tissue, organ, system, since drug targets are expressed in physiological states like diseases, symptoms and toxicity and prediction about activities at the drug-target level can be used according to the invention to automatically predict the activity at the higher biological levels.
A further embodiment of the invention provides for new molecular structures that match the rule for a given class to be generated computationally. These molecular structures may be generated using an exhaustive graph-theoretic methodology or using any evolutionary method. The invention provides for the generated molecules to always contain the patterns specified by the rules and the molecules may or may not exist previously in nature.
The invention further provides for embodiments of methods and systems wherein the system is programming language, operating system and storage mechanism agnostic. While currently implemented in Java in one preferred embodiment, the system according to various embodiments can be implemented in a wide variety of programming languages, database systems, rule engines, and file systems, so long as the chief features of hierarchical domain knowledge, rule induction and application for many activity classes are followed.
At least one embodiment of the invention provides for separating the process steps for assembling domain knowledge or ontologies, transforming two-dimensional chemical structure data to this ontological form, inferring conserved hierarchical patterns in molecular classes and storage, and applying the rule base using rule engines, lightweight directory access protocol (LDAP), and relational databases.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1A illustrates a system with computer hardware and software according to an embodiment of the invention.
Fig. 1B illustrates database and software components and processing steps according to an embodiment of the invention.
Fig. 2A illustrates an example of a system architecture for a first module according to an embodiment of the invention.
Fig. 2B illustrates an example of a system architecture for a second module according to an embodiment of the invention.
Fig. 2C illustrates an example of a system architecture for a third module according to an embodiment of the invention.
Fig. 2D illustrates an example of a system architecture for a fourth module according to an embodiment of the invention.
Fig. 2E illustrates connectivity between the Modules 1-4, according to an embodiment of the invention.
Fig. 3 illustrates a test set of 1233 antibiotics in an exemplary implementation and case study according to one embodiment of the invention.
Fig. 4 illustrates the 53 hits obtained after running the test set against the training set rules in an exemplary implementation and case study according to one embodiment of the invention.
Fig. 5 illustrates the 35 hits cross-checked for toxicity in an exemplary implementation and case study according to one embodiment of the invention.
DETAILED DESCRIPTION
A preferred embodiment of the invention provides for a system and methods for automating molecular mining and biological activity prediction, using XML schema, XML queries, rule inference and Rule Engines, wherein chemical structure can be related to biological and pharmacological activities by transforming molecular structures to a hierarchical representation of chemical concepts and descriptors (such as, for example, deriving a functional group schema for a set of molecules), building an XML file that is similar to the functional group schema, discovering causal links between functional groups or other ontologies and biological activity by detecting common tree-like patterns, creating a Rule Base of biological activities and functional group rules based on the causal links, automating prediction of likely bioactivity of new molecules using a Rule Engine, RDBMS, and XML/XQuery together with the Rule Base, and generating constitutional isomers that have the same functional groups for a given biological activity. The invention can be further illustrated by the additional detailed descriptions of preferred embodiments provided below and by way of specific examples of software code components used to implement a preferred embodiment of the system and methods.
A preferred embodiment provides for working between node levels of the hierarchical tree-based description of the chemical structure of a molecule, where SAR relationships that pertain to different levels are being mined from the database and applied to the similarity data-mining and rule inference, so that rule development is based on more "relational" information (e.g., internal relationships, or relationships between internal molecular structure), rather than simply on strings, weighted strings or matrices of key fragments or descriptors.
Referring to Fig. 1A, a preferred embodiment can provide for a computational system 5 comprised of computer hardware and software, more particularly a central processing unit 2, memory 4, graphical user interface 6, such as, for example, a computer monitor, a user input device 8, such as, for example, a keyboard, a mouse or other input device, computer bus 7, storage device(s) 9, such as hard disks, removable disks, network storage, or other storage devices, external data connectivity 3, such as, for example, Internet, Web, local area network, wide-area network, database 20, and software modules 100. Software 100 can include operating system software that can be stored on storage device 9 and loaded into computer memory 4 to control operation of the processor and to direct data within the system and to control other software modules. Software 100 can additionally include other software modules according to embodiments of the invention as will be described below. It will be appreciated that software 100 can be distributed in multiple locations within and outside the local elements of system 5, such as being distributed on external servers reachable through LAN and/or the Internet. It will be appreciated that the bus 7 can be wires and/or a combination of wire and wireless connectivity. Software 100 can include modules that can bring data from external data sources 3 and store them as part of database 20. It will be appreciated, therefore, that in various preferred embodiments database 20 can be considered to include data sources 3. Database 20 can be stored in any manner of storage device 9 and/or can remain as distributed data stored in many locations and forms locally, or on removable media, or accessible through wired or wireless connections via the Internet, via satellite or other telephony signal.
For one preferred embodiment of the invention, Fig. 1B illustrates some aspects of the interrelationship of system components, such as software 100 and database 20 with program software modules according to the invention and some of the method steps associated with the software operations. For example, software 100 can include rule induction engine 12, Rule Base (or Knowledge Base) 14, rule application engine 16 and output results 18, such as, for example, a resulting output of molecules with predicted activities. Input 10 can be an ontology and can further comprise an XML template. Input 10 can be stored in a database, such as database 20, which can include storage in distributed fashion accessible via the Internet.
Database 20 can include a relational database, or an RDBMS, filesystems, Internet or other sources and can include molecular structures, molecular activity data, biological data, biological activity data (or bioactivity data). It will be appreciated that the terms "activity", "biological activity" and/or "bioactivity" are used in this specification to describe any one or more aspects of the full range of pharmacological interactions, including pharmacokinetic activities and/or pharmacodynamic activities, and without limitation including absorption, rate of distribution, volume of distribution, metabolism, excretion, half-life, receptor binding activity, receptor binding inhibition, specific and/or non-specific activities, specificity, toxicity, signaling disruption, modulation or mediation, and further including the movement, change, effect or other response, or lack thereof, of any one or more of the full range of biological constituents and biological processes, including, without limitation, DNA, RNA, genes, chromosomes, proteins, nuclei, mitochondria, cytoplasm, cell walls, biological pathways, cells, tissues, organs, enzymes, metabolism, serum, whole organisms, physiological state, degree of health, therapeutic index or margin, and any other aspect of biological structure, interaction and/or response.
Still referring to Fig. 1B, database 20 can include any one or more of a full range of chemical and/or molecular descriptors, or chemical descriptive information, or parameters relating to characteristics of chemicals and/or molecules, relating to molecular structure and physical aspects of molecules, including, without limitation, number of atoms, type of atoms, atomic number, atomic weights, atomic relationships, electronegativity, excitation levels, valence state, activation information, atomic physics parameters, field strengths, total energy, enthalpy, electronic energy, heat of formation, entropy, repulsion energies, attraction energies, resonance characteristics, electrostatic characteristics, electron kinetic energy densities, energies of protonation, bonds, number of bonds, bond types, bond distances, bond angles, bond strengths, rings, ring structures, rotational stability, molecular wobble, molecular vibration, relative angles between ring planes, chirality, vertex properties, molecular parts, number of terminal atoms, functional groups, ligands, isomer characteristics, molecular size, molecular weight, molecular chain characteristics, molecular orientation, topology, substructural relationships, 2-dimensional structural formulae, 2-dimensional descriptive elements and/or 3-dimensional descriptive elements, stereochemistry, and/or any number of other types of information describing chemicals and/or molecules.
Additionally, database 20 can include any one or more of a full range of descriptors relating to chemical and/or molecular reactivity, interactivity and/or other aspects of physical chemical relationship between one molecule and another molecule, or between one molecule and many other similar or different molecules, or between one group of many molecules and another group of many molecules of the same or different type, including, without limitation, electrochemical interactions, absorption, dissolution, repulsion, binding coefficient, specific activity, binding strength, crystallization parameters, melting point, molecular stability, association, dissociation, activity coefficients, activity constants, dissociation constants, pK, pKa, any number of chemical reactivity rate constants, density, solubility, and/or viscosity, inter alia.
Continuing to refer to Fig. 1B, at step 111 the rule induction engine 12 reads in data from input source 10, which can include XML template/ontology and at step 103 reads data from database source 20, which can include molecular structure and activity data. Processing within the rule induction engine can include, for example, without limitation, transformation steps, compound clustering steps, pattern discovery steps, constraint adjustment steps and rule validation steps. At step 113 the rule induction engine 12 outputs a set of rules to a Rule Base 14 (which can also be termed a Knowledge Base). At step 105 the Rule Base can be written, stored and otherwise maintained and/or manipulated in the database 20. At step 115 the Rule Application Engine 16 addresses or reads from the Rule Base 14. Additionally, at step 107 the Rule Application Engine 16 acquires from database source 20 ontology data related to molecular structure (such as, for example, XML Ontology including functional groups, ring systems, atom types) and a target set of molecular structures with unknown Activity Class Data, which can come from flat files, RDBMS, the Web and/or LDAP sources.
The rule application engine 16 can perform, without limitation, steps such as predicting activity classes of unknown molecules and generating, based on constraints, new molecular structures using different scaffolds that can be predicted to have certain bioactivities. At step 117 the Rule Application Engine 16 outputs the results 18, and at step 109 the results can be stored in the database 20, which, as noted above, can include distributed storage on the Internet, so that step 109 can include transmitting results to any number of a variety of destinations on the Internet for storage and/or further operations. Results 18 can include, without limitation, results of activity class prediction and/or new molecular structures with predicted bioactivities.
It will be appreciated that the interconnectivity of the hardware and software modules depicted in Fig. 1A and Fig. 1B allows for ongoing, iterative processing, which can include machine learning, whereby writing of results into database 20 allows new information to be made available to the ontology source 10, the rule induction engine 12, the Rule Base 14 and to the Rule Application Engine 16 in immediately subsequent cycles of processing.
The architecture of a further preferred embodiment of the invention can have several distinct modules. For example, Fig. 2A illustrates a system architecture and processes of a first module, according to an embodiment of the invention. Referring to Fig. 2A, in one preferred embodiment, an input data file 10, such as, for example, an XML ontology, can contain structural or other characteristic information about molecules, such as, for example, functional groups, ring systems and atom types, inter alia. A further source of input data 20 can include, by way of example and without limitation, molecular structure activity data, flat files, relational database management system (RDBMS), network data sources (e.g., Internet and/or World Wide Web), and/or LDAP. Input data 10 is read at step 211 and further source of input data 20 is read at step 203 as inputs to transformation engine 22, which transforms the data and produces at step 213 output data record(s) 24, which can be, for example, molecular XML ontology records. Note that data input step 211 and step 203 depicted in Fig. 2A for an embodiment can correspond closely with data input step 111 and step 103, respectively, depicted in Fig. 1B for an embodiment of the invention.
Fig. 2B illustrates system architecture and processes of a second module, according to an embodiment of the invention. At step 311 data records 24 are read into a clustering engine 26, which can perform compound clustering based on pattern similarity, such as, for example, based on similar patterns seen in the hierarchical XML-tree structures. The procedure progresses at step 313 to include operation of a Rule/Conserved Pattern Discovery Engine 28. At step 315 the Rule/Conserved Pattern Discovery Engine 28 can output to an output record 30, which can include, for example, outputs that display valid rules for the entire class and individual clusters therein.
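Clustering compounds by the similarity of their hierarchical patterns could look roughly like the sketch below. The specification does not name a similarity measure or a clustering algorithm, so the Jaccard index over sets of concept paths and the greedy leader clustering used here are assumptions chosen for brevity; the molecule identifiers and paths are invented.

```python
def jaccard(a, b):
    """Jaccard index of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def cluster(molecules, threshold):
    """Greedy leader clustering: each molecule joins the first cluster whose
    leader is at least `threshold` similar, otherwise it starts a new cluster."""
    clusters = []  # each cluster is a list of molecule ids; the first is the leader
    for mol_id, patterns in molecules.items():
        for members in clusters:
            if jaccard(patterns, molecules[members[0]]) >= threshold:
                members.append(mol_id)
                break
        else:
            clusters.append([mol_id])
    return clusters

# Hypothetical molecules, each reduced to a set of concept paths.
MOLS = {
    "m1": {"Alkane/PrimaryAlkane", "Carbonyl/CarboxylicAcid", "AromaticRing"},
    "m2": {"Alkane/PrimaryAlkane", "Carbonyl/CarboxylicAcid"},
    "m3": {"Amine/PrimaryAmine", "AromaticRing"},
}
tight = cluster(MOLS, threshold=0.6)
loose = cluster(MOLS, threshold=0.2)
print(tight, loose)
```

Lowering the threshold merges the singleton into the larger cluster, which is the kind of adjustment a reclustering pass would make.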
If sufficient valid rules are generated for an entire class and its clusters, then the process can end at step 317. If a sufficient set of valid rules is not generated, then the process can continue in step 319. In the rule validation component 32, rules are deemed non-trivial or valid if they contain at least three distinct nodes; for example, in a case of functional groups, an alkane, aromatic ring and carbonyl group would be three distinct nodes. If the rules are not valid, then the system can either relax the constraints and, in step 321, pass the process back to the Discovery Engine 28 or, in step 323, change the similarity threshold for cluster formation and pass the process back to the clustering engine 26 to update clustering.
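The validity test (at least three distinct nodes) and the relax-and-retry loop can be sketched as follows. What "relaxing the constraints" means is not spelled out in the text; this sketch assumes it means lowering the fraction of molecules in a cluster that must contain a node, and the fraction schedule is hypothetical.

```python
def is_valid(rule, min_nodes=3):
    """A rule is non-trivial if it names at least `min_nodes` distinct nodes."""
    return len(set(rule)) >= min_nodes

def conserved(cluster_patterns, min_fraction):
    """Nodes present in at least `min_fraction` of the cluster's molecules."""
    n = len(cluster_patterns)
    nodes = set().union(*cluster_patterns)
    return {x for x in nodes
            if sum(x in p for p in cluster_patterns) / n >= min_fraction}

def discover_with_relaxation(cluster_patterns, fractions=(1.0, 0.75, 0.5)):
    """Try strict conservation first, then relax; return (rule, fraction) or None."""
    for fraction in fractions:
        rule = conserved(cluster_patterns, fraction)
        if is_valid(rule):
            return rule, fraction
    return None

# Hypothetical cluster: three molecules share three nodes, one lacks the carbonyl.
CLUSTER = [
    {"Alkane", "AromaticRing", "Carbonyl"},
    {"Alkane", "AromaticRing", "Carbonyl"},
    {"Alkane", "AromaticRing", "Carbonyl"},
    {"Alkane", "AromaticRing"},
]
result = discover_with_relaxation(CLUSTER)
print(result)
```

With strict conservation only two nodes survive, so the rule is rejected; after one relaxation step the carbonyl node is admitted and the rule passes the three-node test.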
The criterion for reclustering is that a valid rule must be found for every cluster of molecules for the given class and the number of singletons should be minimal. The permissible number of singletons is a user-defined criterion. If the rules are valid, then the rule validation process can continue at step 325 to output the result to an output record 34, which can include, for example, an output record in the form of an addition to a Rule Base stored in or associated with a Rule Engine, RDBMS, LDAP, and/or File System Storage. Note that step 325 and output 34 depicted in Fig. 2B for an embodiment of the invention can correspond closely with step 113 and Rule Base 14, respectively, depicted in Fig. 1B for an embodiment of the invention. After writing output to a file and/or Rule Base accessible to a Rule Engine, the process of this second module can end in step 327.
Fig. 2C illustrates a system architecture and processes of a third module, according to an embodiment of the invention. The Rule Application Engine 16 can acquire at step 411 input data 10, which can include, for example, without limitation, an XML ontology comprising functional groups, ring systems, atom types and/or other chemical descriptors, and can further acquire data from data storage 34 at data input step 413, which can include, for example, without limitation, data from rule-engine storage, Rule Base, RDBMS, LDAP, XML and/or file system storage.
Further, data 34 depicted in Fig. 2C can be the same, in various embodiments of the invention, as the data result 34 from the second module depicted in Fig. 2B. Continuing to refer to Fig. 2C, the Rule Application Engine 16 can further acquire, at step 415, additional data from a further source of input data 20, which can include, for example, without limitation, molecular structure activity data, flat files, relational database management system (RDBMS), network data sources (e.g., Internet and/or World Wide Web), and/or LDAP. At step 419, the Rule Application Engine 16 can pass data to an Activity Class Prediction component 36, which can output at step 421 an output result 40, which can include, without limitation, predicted activity classes that can be stored in a Rule Engine, Rule Base, RDBMS, LDAP, XML and/or file system storage, whereupon at step 423 this process path can end.
Additionally, the Rule Application Engine 16 can proceed through step 417 to generate constitutional isomers of the training set molecules or the test set molecules at output 38, and the rule engine 16 can then apply a further step 511 (see Fig. 2D) to select isomers constrained to follow specific rules related to a class. Since the functional groups are the same but the structure changes completely, new structures that may not be found in nature can be found by scaffold-hopping as an output result 42. The following example shows an input molecule in SMILES format that has anti-asthmatic activity. This is only one of the molecules from a set of anti-asthmatic molecules cited earlier in the document:
O(CCCCc1ccccc1)c1ccc(cc1)C(=O)Nc1cc2oc(cc(=O)c2cc1)c1n[nH]nn1
The constitutional isomer generation code then rearranges connections between atoms and bonds of the molecule to generate constitutional isomers, i.e., molecules with the same molecular formula but different structures. The output is 50 molecular structures as follows:

Thus, molecules that are structural isomers and also follow rules for anti-asthmatic bioactivity can be generated. This illustrates the functionality of the constrained molecular generator in a preferred embodiment. Another way to achieve a similar operation and outcome according to the invention can be to modify the isomer generation routine to directly generate only those molecules that have the required functional group patterns. This functionality is very important in drug discovery, where the aim is to obtain molecules that are bioactive and yet sufficiently different structurally from patented molecular structures; therefore, the software system and methods of the invention provide a substantial advantage to researchers and the economics of drug discovery.
Fig. 2D illustrates a system architecture and processes of a fourth module, according to an embodiment of the invention. Constrained structures 38, which can be generated as output from a rule application engine, can provide input to a further component of the rule application engine that can create further New Molecular Structure/Scaffold output 42, these being constitutional isomer two-dimensional structures for one or more new molecular structures based on the constraints, which new structures can have different scaffolds. When the New Molecular Structure/Scaffold output 42 has been completed, then this branch of the rule application engine process can end at step 513. An example of the code element(s) for this operation is provided herein (see files "MoleculeTableJava.txt" and "ONTO-001xq800-CODE APPENDIX-15.txt" in CODE APPENDIX CD-R, Copy 1), along with the SMILES output, as discussed further, below, under the section heading "Constrained Structure Generator."
Fig. 2E illustrates connectivity between the four modules described above in Figs. 1A-1D, wherein the numbered elements and steps have the same meaning between the figures, such that the corresponding description from Figs. 1A-1D is incorporated here by reference for description of Fig. 2E. The system according to a preferred embodiment can run on any modern 32-bit or 64-bit computer. Preferably, the computing system can run Java(TM) 1.5 or higher. Preferably, the system has at least 512 MB of RAM. According to one preferred embodiment, additional features of the system can include:
(a) An XML schema/DTD/XML file, to represent chemical concept hierarchies such as functional groups, rings, atomic types and their interconnections;
(b) An XPath/XQuery/XML transformation engine that translates molecular structures to XML records, using the schema from system component (a), above;
(c) A clustering engine to cluster subsets of molecules based on similarity of their schema. This enables better rule discovery since only similar molecules are used for rule discovery;
(d) An XML/XPath/XQuery conserved pattern or rule discovery engine to find hierarchical patterns common to a class or cluster of molecules, including a rule module to insert common patterns as WHEN...THEN or IF...THEN rule sets into a rule engine or relational database;
(e) Components and methods for manual and/or automated validation of rules based on external information or user expertise;
(f) A rule base or knowledge base;
(g) A rule application engine, to predict potential activity classes for new molecules in proprietary and public databases, based on rules in the knowledge base; and/or
(h) Components and methods for constrained molecular structure generation based on rules for activity class.
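As an illustration of component (b), the following minimal sketch copies a small XML template and writes functional-group counts onto the matching nodes, so that every molecule record repeats the tree structure of the template. The template contents, the `count` and `name` attributes and the function `to_record` are hypothetical; an actual template would follow the schema of component (a), and the counts would come from a structure parser that is not shown here.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical template: a tiny functional-group hierarchy used only for illustration.
TEMPLATE = """
<Molecule>
  <Alkane><Primary/><Secondary/></Alkane>
  <Amine><Tertiary/></Amine>
  <Benzenering/>
</Molecule>
"""


def to_record(name: str, counts: Dict[str, int]) -> ET.Element:
    """Copy the template tree and write each count onto its matching node."""
    root = ET.fromstring(TEMPLATE)
    root.set("name", name)

    def fill(node: ET.Element, prefix: str) -> None:
        for child in node:
            # Build a colon-separated path such as "Alkane:Secondary".
            path = f"{prefix}:{child.tag}" if prefix else child.tag
            child.set("count", str(counts.get(path, 0)))
            fill(child, path)

    fill(root, "")
    return root


record = to_record(
    "mol-1", {"Alkane:Secondary": 2, "Benzenering": 1, "Amine:Tertiary": 1}
)
xml_text = ET.tostring(record, encoding="unicode")
```

Because every record shares the template's shape, the same path expression (for example `record.find("Alkane/Secondary")`) can be evaluated against every molecule in a class.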
A preferred embodiment of the invention does not use logic languages to facilitate the data representation, transformation and rule induction. A set of molecules with 2D structural information belonging to a particular activity class can form an input to a system according to one embodiment. This input can be a file, a query to an online/local database or to a web service or RSS feed, or the parsed information from a web query, among other sources. The class is generally nominal, such as, for example, anti-cancer, hepatotoxic, or other bioactivity.
Numerical but discrete classes can be transformed to nominal ones by defining intervals and allocating a class name to each interval. SDF, MOL2, SMILES, XYZ, CML and other widely used molecular formats can be used, so long as the two-dimensional connectivity information about atoms and bonds is present or can be reconstructed. According to various embodiments, there is no particular requirement for any of the modules to be located solely in the client, middleware or server part of a computing system and/or network. Depending on the implementation, the various modules can occur in different places in the system. As a general practice, the more computationally intensive modules described in the examples herein are preferably implemented on the server side. The client part of the system can generally deal with input/output and sketching the molecules for entry into the system.
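The conversion of numerical classes to nominal ones by defining intervals can be sketched as below. The interval boundaries, the class names and the function `to_nominal` are assumptions chosen for illustration (here, a hypothetical potency value where lower means more potent); the specification does not prescribe particular intervals.

```python
from bisect import bisect_right
from typing import List, Sequence

# Hypothetical interval boundaries and class names for a potency measurement.
BOUNDARIES: List[float] = [1.0, 10.0]
LABELS: List[str] = ["high_potency", "medium_potency", "low_potency"]


def to_nominal(value: float, boundaries: Sequence[float], labels: Sequence[str]) -> str:
    """Map a numerical value to the name of the interval that contains it."""
    if len(labels) != len(boundaries) + 1:
        raise ValueError("need exactly one label per interval")
    # bisect_right places a value equal to a boundary in the upper interval.
    return labels[bisect_right(boundaries, value)]


classes = [to_nominal(v, BOUNDARIES, LABELS) for v in (0.5, 1.0, 25.0)]
```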
EXAMPLE ~ Case report
To further illustrate the predictive advantages of the method and system according to the invention, a program (named "Ontomine") that predicts toxicity and bioactivity of compounds based on their functional groups is presented, using CNS toxicity prediction as the Example Two case study.
Methods
A training set of 49 drugs with known toxicity against the Central Nervous System was used to obtain functional group patterns indicating CNS toxicity. Ontomine transforms the input molecules to XML reflecting the functional group schema and then mines patterns common to 23 subclusters formed during the clustering stage. The rules can be as follows, shown in Table 3, below.
Table 3. Rules for an embodiment of the present Case Study.
Cluster 1: (Alkane:Secondary[2]) AND (Benzenering[1]) AND (Amine[1])
Cluster 2: (Alkane:Secondary[2]) AND (Benzenering[1]) AND (Amine:Tertiary[1])
Cluster 3: (Alkane:Secondary[2]) AND (Benzenering[2]) AND (Amine:Tertiary[1])
Cluster 4: (Alkane:Secondary[2]) AND (Benzenering[1]) AND (Amine:Tertiary[1]) AND (Alcohol[1]) AND (Ether[1])
Cluster 5: (Alkane:Secondary[2]) AND (Alkene[1]) AND (Amine:Primary[1]) AND (Carbonyl:CarboxylicAcidDerivative:CarboxylicAcid[1])
Cluster 6: (Alkane:Primary[4]) AND (Amine:Tertiary[2]) AND (Disulfide[1]) AND (SulfenicDerivative[2]) AND (Thiocarbonyl[2])
Cluster 7: (Alkane:Secondary[1]) AND (Aniline[2]) AND (Benzenering[2]) AND (Amine:Tertiary[3]) AND (SulfenicDerivative[1])
Cluster 8: (Alkane:Secondary[1]) AND (Aniline[2]) AND (Benzenering[2]) AND (Amine:Tertiary[2]) AND (SulfenicDerivative[1])
Cluster 9: (Alkane[4]) AND (Benzenering[2]) AND (Amine:Secondary[1]) AND (Carbonyl[1]) AND (ArylHalide:ArylChloride[2])
Cluster 10: (:Benzenering[2]) AND (Amine:Tertiary[1]) AND (Iminyl:ketimine:Secondary[1]) AND (Lactam[1]) AND (Carbonyl[1]) AND (ArylHalide:ArylChloride[1])
Cluster 11: (Alkane:Secondary[4]) AND (Aniline[2]) AND (Benzenering[2]) AND (Amine:Tertiary[2]) AND (SulfenicDerivative[1])
Cluster 12: (Alkane:Primary[4]) AND (Alkane:Secondary[6]) AND (Alkane:Tertiary[2]) AND (Alkane:Quartary[3]) AND (Benzenering[1]) AND (Phenol[1]) AND (Amine:Tertiary[1]) AND (Alcohol:Tertiary[1]) AND (Ether[2])
Cluster 13: (Alkane:Secondary[2]) AND (Benzenering[1]) AND (Amine:Tertiary[1])
Cluster 14: (Alkane:Primary[2]) AND (Alkane:Secondary[4]) AND (Alkane:Tertiary[1]) AND (Carbonyl:CarboxylicAcidDerivative:CarboxylicAcid[1])
Cluster 15: (:Benzenering[1]) AND (Amidine[2]) AND (Amine:Secondary[2]) AND (Guanidine[1]) AND (ArylHalide:ArylChloride[2])
Cluster 16: (Alkane:Primary[1]) AND (Alkane:Secondary[6]) AND (Alkane:Tertiary[1]) AND (Benzenering[1]) AND (Oxoarene[1]) AND (Amine:Tertiary[1]) AND (Lactam[1]) AND (ArylHalide:ArylFluoride[1])
Cluster 17: (Alkane:Primary[4]) AND (Alkane:Secondary[4]) AND (Alkane:Tertiary[3]) AND (Alkene[1]) AND (Benzenering[1]) AND (Amine:Secondary[1]) AND (Amine:Tertiary[3]) AND (Amide:Secondary[1]) AND (Lactam[2]) AND (Carbonyl[3])
Cluster 18: (:Benzenering[2]) AND (Iminoarene[1]) AND (Amine:Secondary[1]) AND (Amine:Tertiary[2]) AND (Enamide[1]) AND (ArylHalide:ArylChloride[1])
Cluster 19: (Alkane:Secondary[2]) AND (Benzenering[1]) AND (Amine:Secondary[1]) AND (Amine:Tertiary[1]) AND (Carbamate[1]) AND (Urethane[1]) AND (Carbonyl[1])
Cluster 20: (:Alkene[1]) AND (Benzenering[3]) AND (Amine:Tertiary[2])
Cluster 21: (:Benzenering[2]) AND (Amidine[1]) AND (Amine:Secondary[1]) AND (Amine:Tertiary[1]) AND (Ether[1]) AND (ArylHalide:ArylChloride[1])
Cluster 22: (:Benzenering[2]) AND (Amine:Secondary[2]) AND (Imide[1]) AND (Urea[1]) AND (Carbonyl[2])
Cluster 23: (Alkane:Secondary[1]) AND (Benzenering[1]) AND (Phenol[2]) AND (Amine:Primary[1]) AND (Carbonyl:CarboxylicAcidDerivative:CarboxylicAcid[1])
The test set consisted of 1,233 antibiotics from PubChem. These were then run against the training set; that is, each molecule of the 1,233 antibiotics was individually screened against all the 23 clusters. This resulted in 35 unique hits.
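Screening one molecule against the cluster rules can be sketched as follows, using three of the Table 3 rules. Two points are assumptions rather than statements of the specification: the bracketed number is read as an exact required count, and the leading colon that appears in some rules (for example `(:Alkene[1])`) is dropped during parsing. The functional-group counts of the test molecule are presumed to be supplied by the transformation step and are invented here.

```python
import re
from typing import Dict, List

# One rule term looks like "(Alkane:Secondary[2])"; a leading ":" is tolerated and dropped.
TERM = re.compile(r"\(\s*:?([A-Za-z:]+)\[(\d+)\]\s*\)")

Rule = Dict[str, int]


def parse_rule(text: str) -> Rule:
    """Turn '(A[1]) AND (B:C[2])' into {'A': 1, 'B:C': 2}."""
    return {path: int(count) for path, count in TERM.findall(text)}


RULES: Dict[int, Rule] = {
    1: parse_rule("(Alkane:Secondary[2]) AND (Benzenering[1]) AND (Amine[1])"),
    3: parse_rule("(Alkane:Secondary[2]) AND (Benzenering[2]) AND (Amine:Tertiary[1])"),
    20: parse_rule("(:Alkene[1]) AND (Benzenering[3]) AND (Amine:Tertiary[2])"),
}


def matching_clusters(molecule: Dict[str, int], rules: Dict[int, Rule]) -> List[int]:
    """Return the cluster ids whose every term is met exactly (assumed semantics)."""
    return [
        cluster_id
        for cluster_id, rule in rules.items()
        if all(molecule.get(path) == count for path, count in rule.items())
    ]


# Invented descriptor counts for a hypothetical test-set molecule.
candidate = {"Alkane:Secondary": 2, "Benzenering": 2, "Amine:Tertiary": 1, "Ether": 1}
hits = matching_clusters(candidate, RULES)
```

A molecule counts as a hit when `hits` is non-empty; extra functional groups that the rule does not mention (here `Ether`) do not prevent a match in this sketch.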
None of the hits were present in the original training set.
Fig. 3 depicts the 1,233 antibiotics that form the test set, out of which 35 molecules were predicted to have CNS toxicity. The toxicity and activity of these 35 molecules was checked in PubChem, PubMed, DrugBank, TOXNET and Google. Fig. 4 shows 53 hits obtained after running the test set against the training set rules.
Case Study Conclusion
Fig. 5 shows the 35 hits cross-checked for toxicity with PubChem annotation, PubMed medical abstracts and available reference information from Google. Toxicity information was available for nine out of the 35 predicted molecules. Out of these nine compounds, six were indeed found to be toxic to the nervous system. The remaining three compounds were annotated as cytotoxic, cardiotoxic and toxic to reproductive cells and to the eye. There was no evidence to indicate that these were not CNS toxins. In general, further experiments would be required to rule out CNS toxicity for the remaining 29 compounds flagged by the software.
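The counts reported in this case study can be tallied as follows. This is only bookkeeping on the figures stated above (1,233 test molecules, 35 hits, nine hits with toxicity information, six confirmed); the variable names are illustrative.

```python
test_set_size = 1233           # antibiotics screened
predicted_hits = 35            # unique molecules flagged as CNS toxic
hits_with_toxicity_data = 9    # flagged molecules with any toxicity annotation
confirmed_cns_toxic = 6        # annotated as toxic to the nervous system

# Annotated hits whose recorded toxicity was of another kind.
other_toxicity_only = hits_with_toxicity_data - confirmed_cns_toxic
# Flagged molecules with no toxicity annotation at all.
without_toxicity_data = predicted_hits - hits_with_toxicity_data
# Flagged molecules not (yet) confirmed as CNS toxins.
not_confirmed = predicted_hits - confirmed_cns_toxic

# Share of annotated hits that were confirmed CNS toxins.
confirmed_fraction = confirmed_cns_toxic / hits_with_toxicity_data
# Share of the test set that was flagged at all.
flagged_fraction = predicted_hits / test_set_size
```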
The case study of Example Two shows the value of the preferred embodiment comprising the Ontomine program in predicting toxicity by using simple conserved hierarchical functional groups. Usage of such rules in expert systems will aid drug discovery companies and regulatory authorities in prioritizing molecules for toxicity testing. This will substantially reduce the cost associated with drug discovery by identifying probable toxicities at a much earlier stage. Ontomine finds simple conserved functional group patterns that indicate the propensity for bioactivity. In the current study, six of the nine flagged molecules for which toxicity information was available were confirmed CNS toxins. The rules are clearly understandable by the end user and can help in better drug design for maximizing therapeutic activity and minimizing the chance of toxicity that leads to regulatory failure.
ADVANTAGES
The methods according to preferred embodiments of the invention are important when trying to analyze and discover the diverse nature of molecules that have a similar biological effect. Mining patterns common to many such biological levels as defined in ontologies such as MeSH and finding common chemical patterns, e.g., counts of functional groups at different levels of the functional groups hierarchy, enables construction of a dynamic structure-activity class knowledge base. Such knowledge bases can rapidly identify potential uses and warning signs for any molecule. Relational database systems, LDAP and XML, previously used for data storage, have now matured as informatics technologies and can be used advantageously according to the invention to store the patterns common to molecular classes. These patterns, when stored in a Rule Engine as rules, can form a Rule Base (the terms 'Rule Base' and 'knowledge base' are considered equivalent herein). These rules can then be applied as queries to newer molecules and can predict the activity class. A set of many such patterns is a Knowledge Base, relating structures to activities.
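Storing patterns in a relational database and applying them as queries can be sketched with an in-memory SQLite database. The table layout (`rule_term`, one row per rule term), the column names and the `predict` function are hypothetical, and the exact-count matching is an assumption; the two stored rules are Clusters 1 and 3 of Table 3.

```python
import sqlite3
from typing import Dict, List, Tuple

conn = sqlite3.connect(":memory:")
# Hypothetical layout: one row per term of each rule.
conn.execute(
    "CREATE TABLE rule_term (activity TEXT, cluster_id INTEGER, path TEXT, n INTEGER)"
)
conn.executemany(
    "INSERT INTO rule_term VALUES (?, ?, ?, ?)",
    [
        ("CNS toxicity", 1, "Alkane:Secondary", 2),
        ("CNS toxicity", 1, "Benzenering", 1),
        ("CNS toxicity", 1, "Amine", 1),
        ("CNS toxicity", 3, "Alkane:Secondary", 2),
        ("CNS toxicity", 3, "Benzenering", 2),
        ("CNS toxicity", 3, "Amine:Tertiary", 1),
    ],
)

# A rule matches when every one of its terms finds an equal (path, n) row in mol.
MATCH_QUERY = """
SELECT r.activity, r.cluster_id
FROM rule_term AS r
LEFT JOIN mol AS m ON m.path = r.path AND m.n = r.n
GROUP BY r.activity, r.cluster_id
HAVING COUNT(*) = COUNT(m.path)
ORDER BY r.activity, r.cluster_id
"""


def predict(molecule: Dict[str, int]) -> List[Tuple[str, int]]:
    """Load one molecule's descriptor counts and return the rules it satisfies."""
    conn.execute("DROP TABLE IF EXISTS mol")
    conn.execute("CREATE TEMP TABLE mol (path TEXT, n INTEGER)")
    conn.executemany("INSERT INTO mol VALUES (?, ?)", list(molecule.items()))
    return conn.execute(MATCH_QUERY).fetchall()


predicted = predict({"Alkane:Secondary": 2, "Benzenering": 2, "Amine:Tertiary": 1})
```

Keeping one row per rule term lets new rules be added without changing the query, which is one way a rule base of this kind could grow over time.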
According to preferred embodiments of the invention, rules derived by the system and methods of the invention can be interpreted as non-alignment related pharmacophores, biophores or toxicophores, depending on the original dataset. The methods and system of the invention can be used for finding potential uses of new molecular structures or potential problems (such as, for example, toxicity) prior to synthesis and screening using high throughput technologies. Drug discovery project managers can use the methods and system of the invention to benchmark the probability of the success of the hit screening programs with reference to historical chemical trends. According to the invention, regulatory agencies using structure activity programs and alert systems for identifying toxicity and adverse effects can use the present methods and system to help define such alerts by means of the rule sets created. Medicinal and computational chemists can use the methods and system of the invention for selecting molecules for High Throughput Screening or selecting and designing molecules likely to possess a particular activity.
The time performance of the present system provides significant advantages to researchers. The system and methods according to various embodiments of the invention preferably take on the order of 20-200 seconds to discover conserved patterns in a set of approximately 5-100 molecules with clustering; more preferably, an embodiment can perform so that it takes on the order of 20-100 seconds to discover conserved patterns in a set of approximately 20-80 molecules with clustering; and even more preferably, an embodiment can perform so that the system can take on the order of 20-25 seconds to discover conserved patterns in a set of approximately 50 or more molecules with clustering. When the rules for prediction are applied, the system can preferably predict activity in the range of 10-100 molecules per minute, more preferably in the range of 20-80 molecules per minute, and even more preferably for approximately 50 or more molecules per minute. A preferred embodiment has essentially no upper limit on the number of structures for which the system can make predictions; a preferred embodiment can handle millions of molecules in a suitable time frame.
We claim:
1. A method for analyzing the relationship of molecular structure and biological activity in one or more molecules, comprising the steps of transforming molecular structure data into a hierarchical representation of chemical concepts and descriptors; and detecting common tree-like patterns in the data.
2. The method of claim 1, comprising the further step of defining distances between at least one of functional groups, ring systems, atoms, bond types, chemical concepts, chemical fragments, and chemical descriptors in an XML schema, DTD or simple XML file.
3. The method of claim 2, comprising the further step of grouping at least one set of molecules having structural data belonging to a common pharmacological or biological origin into at least one class, and transforming the at least one class formed from the at least one set of molecules having structural data into a resultant XML file.
4. The method of claim 3, wherein the step of transforming the at least one class is accomplished by using an XML template file having a tree-like structure and the resultant XML file repeats the tree-like structure of the XML template file once for every molecule.
5. The method of claim 4, comprising the further step of querying the resultant XML file, based on at least one given chemical, biological or pharmacological classification, to produce hierarchical patterns common to at least one group of molecules in the at least one class, and generating at least one rule set for the at least one given chemical, biological or pharmacological classification.
6. The method of claim 5, including generating at least one rule set having a confidence level and salience that are proportional to the percentage of records and the depth of the tree to which they are conserved.
7. The method of claim 5, including the further step of finding rules for continuous, binary, one class or multi-class data.
8. The method of claim 5, including the further step of storing the generated rule set in a business rules engine or in a database.
9. The method of claim 5, including the further step of inferring rules or patterns common to or distinct within a plurality of different biological classes or subclasses.

10. The method of claim 5, including the further step of constructing an integrated knowledge base of rules using all biological and functional classes as defined in the NCBI MeSH browser, PubChem pharmacological classes at different levels of activity including at least one of drug target level, biological process level, therapeutic level, disease level, clinical indication, and syndrome level.
11. A computer-implemented method for finding rules common to a set of molecules having common pharmacological, biological or chemical properties, comprising the steps of defining at least one object class of molecules based on at least one set of molecules having common pharmacological, biological or chemical properties, discovering tree-like patterns common to the at least one object class of molecules, through use of conceptual hierarchies or ontologies formed from molecular functional group, ring system, or atom-type descriptors; and inferring rules from the tree-like patterns.
12. The method of claim 11, comprising the further steps of building groups of rules based on combining different classes of molecules, storing, executing and combining the groups of rules in file systems, XML, rule engines, LDAP directories or relational databases.
13. The method of claim 11 or claim 12, wherein the biological activities are arranged in a hierarchy or schema resembling the hierarchy of MeSH.
14. A method for clustering at least one subset of molecules in a group of molecules on the basis of similarity between molecules in the group, comprising the steps of deriving hierarchical patterns from a representation of at least two molecules in the group; finding similarity between the hierarchical patterns; and clustering the at least one subset of molecules based on the similarity between hierarchical patterns.
15. The method of claim 14, comprising the further steps of finding conserved hierarchical conceptual patterns in clusters of similar molecules rather than in all molecules in at least one class of molecules and finding sets of rules based on the step of clustering, wherein each cluster leads to different sets of rules.

16. The method of claim 11, wherein at least one of the object class, the conceptual hierarchies and the descriptors both have discrete and continuous values, and comprising the further steps of discretizing the continuous values by binning into class intervals; and arranging the descriptors corresponding to different functional groups, rings and atom types in a hierarchical order.
17. The method of claim 16, wherein the descriptors are spectroscopic data.
18. The method of claim 15, wherein at least one of the sets of rules includes an equation formulaic relationship between discretized class values and rule nodes and where the parameters of the equation are used for rule induction.
19. The method of claim 11, comprising the further step of applying the inferred rules to subsequently infer all potential activities or to confirm a particular activity by forward and backward chaining in a rule engine, or to subsequently perform Boolean queries on a relational database or similar schema.
20. The method of claim 12 or claim 13, comprising the further step of applying the inferred groups of rules to subsequently infer all potential activities or to confirm a particular activity by forward and backward chaining in a rule engine, or to subsequently perform Boolean queries on a relational database or similar schema.
21. The method of claim 14, comprising the further step of finding similarities between connectivities of functional groups, ring systems and atom types conserved in all or clusters of molecules.

22. The method of claim 11 or 14, comprising the further step of finding bioisosteres by enumerating differences between functional groups, rings and atom types in the molecules, in a given class.
23. The method of claim 11, 12 or 13, comprising the further step of generating all chemically feasible molecular structures from molecular formulae of known drugs and drug-like molecules, and inferring activities from the rules or from groups of rules for these molecular structures.
24. The method of claim 23, comprising the further step of obtaining a predicted activity or activities at the drug target level, wherein the drug target is expressed in the context of a physiological state such as a disease, a symptom or a toxicity, and automatically predicting the activity at a cellular level, tissue level, organ level, or system level based on the predicted activity or activities at the drug target level.
25. The method of claim 23, comprising the further step of generating computationally new molecular structures that match the rule for a given class using an exhaustive graph theoretic methodology or using an evolutionary method, wherein the generated molecules contain the patterns specified by the rules.
26. A method for automated molecular mining and activity prediction, comprising the steps of deriving for a set of molecules a functional group schema; building an XML file that is similar to the functional group schema; finding patterns in the similarity of the XML to the functional group schema that are conserved for a given set of molecules; writing the conserved patterns to a results file; and inferring biological activity from the conserved patterns.
27. A method for automated molecular mining and activity prediction, comprising the steps of deriving for a set of molecules a functional group schema; building an XML file that is similar to the functional group schema; discovering causal links between functional groups or other ontologies and biological activity; creating a Rule Base of biological activities and functional group rules based on the causal links; automatically predicting likely bioactivity of new molecules using a Rule Engine, RDBMS, and XML/XQuery together with the Rule Base; and generating constitutional isomers that have the same functional groups for a given biological activity.
28. A system for analyzing the relationship of molecular structure and biological activity in one or more molecules, comprising a transformation engine software module, a discovery engine software module, a prediction engine software module, and an isomer generation software module.
29. The system of claim 28, further comprising XQuery, XML and Java code operable on a computing platform for parsing data input files and transforming the input data into XML schema and ontologies.
30. The system of claim 28, further comprising a molecule reader module operable for parsing data input files that contain molecular information.

31. The system of claim 28, further comprising an SDF-to-XML converter software component module operable for converting SDF formatted data to XML schema.
32. The system of claim 28, further comprising a table-building software component module for displaying molecules in a Graphical User Interface.
33. The system of claim 28, further comprising a Rule Engine operable for forward chaining or for backward chaining and having a pattern matching algorithm; and a Rule Base containing a plurality of rules for biological activities, wherein the Rule Base is interoperably coupled to or resident within the Rule Engine.

34. The system of claim 33, wherein the Rule Base comprises more than 500 rules.
35. A computer system comprising:
a central processing unit;
a user interface; and
a main memory having an operating system that supports a plurality of object-oriented classes, wherein the classes provide at least one data-input object,
a transformation engine software module interoperable with the data-input object,
a discovery engine software module interoperable with output files from the transformation engine, and
a prediction engine software module for predicting activities that is interoperable with output files from the discovery engine.

36. The computer system of claim 35, wherein the at least one data object comprises an XML schema representing functional groups in a set of molecules.
37. A computer-implemented, ontology-based automated data-mining and activity-prediction system, comprising:
a computer having a processor, a user interface, an input means, a memory and an operating system capable of running software;
a first storage means for storing chemical structure data;
a second storage means for storing transformed hierarchical data;
at least one rule base comprising a plurality of rules;
transformation means for transforming chemical structure data in the first storage means into transformed hierarchical data stored in the second storage means;
a pattern discovery means for discovering patterns from analysis of the transformed hierarchical data and the rules;
a prediction means for searching against the rule base and returning prediction results; and
display means for displaying the returned search results.
38. A Rule Base for predicting biological activity of molecules based on structure-activity relationships, wherein the rule base is produced by a computer-implemented method for finding rules common to a set of molecules having common pharmacological, biological or chemical properties, comprising the steps of defining at least one object class of molecules based on at least one set of molecules having common pharmacological, biological or chemical properties, discovering tree-like patterns common to the at least one object class of molecules, through use of conceptual hierarchies or ontologies formed from molecular functional group, ring system, or atom-type descriptors; and inferring rules from the tree-like patterns.
39. The method of claim 12, wherein the different classes of molecules distinguish molecules that are anti-tubercular but not anti-fungal from sets of molecules that are anti-tubercular and anti-fungal.
Dated this 19th June, 2008

For Systems Biology India Pvt. Ltd

(Mr. Juha Saharinen)

Documents

Application Documents

#	Name	Date
1	1293-MUM-2008- PUBLICATION REPORT.pdf	2022-04-29
2	1293-MUM-2008-CORRESPONDENCE(IPO)-(AB 21)-(14-10-2015).pdf	2015-10-14
3	1293-mum-2008-abstract.pdf	2018-08-09
4	FORM9.TIF	2018-08-09
5	abstract1.jpg	2018-08-09
6	1293-mum-2008-claims.pdf	2018-08-09
7	1293-MUM-2008_EXAMREPORT.pdf	2018-08-09
8	1293-MUM-2008-CORRESPONDENCE(IPO)-(8-10-2013).pdf	2018-08-09
9	1293-MUM-2008-FORM 9(20-6-2008).pdf	2018-08-09
10	1293-MUM-2008-CORRESPONDENCE(IPO)-(FER)-(30-9-2014).pdf	2018-08-09
11	1293-mum-2008-form 3.pdf	2018-08-09
12	1293-mum-2008-correspondence.pdf	2018-08-09
13	1293-mum-2008-form 2.pdf	2018-08-09
14	1293-mum-2008-description(complete).pdf	2018-08-09
15	1293-mum-2008-form 2(tittle page).pdf	2018-08-09
16	1293-mum-2008-drawing.pdf	2018-08-09
17	1293-mum-2008-form 1.pdf	2018-08-09
18	1293-mum-2008-form 18.pdf	2018-08-09