Universal Data Transformation Tool.

Abstract: The present invention relates to a system and method for transforming a legacy data format into standardized data in compliance to a target data standard using a data transformation tool. Further, the invention provides a method for enabling the said data transformation tool to automatically and dynamically generate a conversion program, wherein the said conversion program is generated on the basis of mapping specifications between multiple source schemas and a target schema.

Patent Information

Application #:
Filing Date: 21 February 2012
Publication Number: 12/2014
Publication Type: INA
Invention Field: COMPUTER SCIENCE
Status:
Parent Application:

Applicants

TATA CONSULTANCY SERVICES LIMITED

NIRMAL BUILDING, 9TH FLOOR, NARIMAN POINT, MUMBAI 400021, MAHARASHTRA, INDIA.

Inventors

1. NEMA, BABU SHASHIKUMAR

TATA CONSULTANCY SERVICES LIMITED, 18, SJM TOWERS, SHESHADRI ROAD, GANDHINAGAR, BANGALORE 560 009, KARNATAKA, INDIA

2. SUBRAMANYA, RAGHAVENDRA BASRUR

TATA CONSULTANCY SERVICES LIMITED, 69/2, 69/3, JAL BUILDING, SALARPURIA GRTP, WHITEFIELD ROAD, BANGALORE 560 066, KARNATAKA, INDIA

3. THOMAS, BIJU KAYYALAKKAL

TATA CONSULTANCY SERVICES LIMITED, 69/2, 69/3, JAL BUILDING, SALARPURIA GRTP, WHITEFIELD ROAD, BANGALORE 560 066, KARNATAKA, INDIA

4. NAGARJUNA, IRAGAM

TATA CONSULTANCY SERVICES LIMITED, 69/2, 69/3, JAL BUILDING, SALARPURIA GRTP, WHITEFIELD ROAD, BANGALORE 560 066, KARNATAKA, INDIA

Claims

1. A computer-implemented method for transforming legacy data into standardized data in compliance to a target data standard, wherein the said method comprises:
a. importing metadata corresponding to source schema;
b. importing at least one data standard corresponding to target schema;
c. mapping the imported metadata of the said source schema with the imported data standards of the said target schema for defining mapping specifications using graphical user interface based on built-in historical intelligence and weighted metadata;
d. identifying closest target variable matches based on the built-in historical intelligence and weighted metadata upon mapping the imported metadata of the said source schema with the data standards of the said target schema;
e. automatically generating a conversion program using the defined mapped specification for transforming the source schema into the target schema, the said target schema is the said standardized data generated in compliance to the target data standard; and
f. validating the transformed standardized data against the said target data standard.

2. The method as claimed in claim 1, wherein the target data standard includes data standard names and their respective versions, code, data set name, data set code, designation, variable name, label, type, length, data standards required by companies and organizations and associating code lists to the target data variables and combination thereof.

3. The method as claimed in claim 1, wherein the mapping specifications includes concatenation, definition of expressions, 1:N variable mapping, code list mapping, mapping history, autosuggest and combination thereof.

4. The method as claimed in claim 1, wherein the transformation module is modularized to be re-used across multiple source schemas.

5. A system for transforming legacy data into standardized data in compliance to a target data standard, wherein the system comprises:
a. an importing module configured for importing metadata of source schema and further configured for importing and managing at least one data standard of target schema;
b. a mapping module configured for defining mapping specifications upon mapping the imported metadata of the said source schema with the imported data standards of the said target schema using graphical user interface based on historical intelligence and weighted metadata;
c. an identification module configured for identifying closest target variable matches on the basis of the built-in historical intelligence and weighted metadata upon mapping the metadata of the source schema with the data standards of the target schema;
d. a transformation module configured for automatically generating a conversion program using the defined mapped specifications, the said conversion program is further configured for transforming the source schema into the target schema, the said target schema is the said standardized data generated in compliance to the target data standard; and
e. a validation module configured for conducting validation of the transformed standardized data against the said target data standard.

6. The system as claimed in claim 5, wherein the target data standard includes data standard names and their respective versions, code, data set name, data set code, designation, variable name, label, type, length, data standards required by companies and organizations and associating code lists to the target data variables and combination thereof.

7. The system as claimed in claim 5, wherein the mapping specifications includes concatenation, definition of expressions, 1:N variable mapping, code list mapping, mapping history, autosuggest and combination thereof.

8. The system as claimed in claim 5, wherein the generated transformed module is capable of being re-used across multiple source schemas.

Specification

FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION
(See Section 10 and Rule 13)
Title of invention: UNIVERSAL DATA TRANSFORMATION TOOL
Applicant
TATA Consultancy Services Limited A company Incorporated in India under The Companies Act, 1956
Having address:
Nirmal Building, 9th Floor,
Nariman Point, Mumbai 400021,
Maharashtra, India
The following specification particularly describes the invention and the manner in which it is to be performed.

FIELD OF THE INVENTION:
The invention generally relates to the field of data transformation.
BACKGROUND OF THE INVENTION:
In today's era, handling and managing data and its standardization is a complex problem. Data contents are among the most valuable assets of any organization. They contain the confidential and useful information of the organization. Data contents are available in different formats and are stored in different types of databases. In the field of research, business or otherwise, the data contents are stored across a plurality of computers or other data storage units. These data contents are further used for analysis upon which business forecasts, business plans, business strategies and decisions are made. To conduct the data analysis efficiently, it is essential to standardize the data in the desired format.
Standardization of data is important for organizations to ensure the uniformity of data, both within the organization as well as among authorities and consortia in different fields of research and businesses. Thus it is a critical business requirement to bring the data in a standardized format, wherein data from multiple source schemas is converted into a desired target data standard without losing the information contained in the said data. Conversion of data from one format to another format is done using data transformation techniques.
In the currently known techniques, data transformation from one format into another is facilitated by manually developed conversion programs. These programs require a lot of manual intervention as well as time-intensive and skill-based coding. Further, the said manually developed conversion programs also require manual or semi-automated reviews that are prone to errors.
Hence, the lack of automatic and dynamic development of data transformation programs that are capable of handling multiple source schemas still remains an unaddressed need in the art. Moreover, yet another limitation associated with the data transformation programs is that they need to be developed every time for the respective source schema depending upon its data type, data format, etc. So, there is a long-felt need for a system and method which is capable of understanding the limitations and requirements of the source and target schemas respectively and which adapts itself to the multiple source schemas in order to automatically and dynamically generate a computer program which is used for data transformation of a particular source and target schema.
OBJECTS OF THE INVENTION
The primary objective of the present invention is to provide a system and method that automatically and dynamically generates a conversion program based on mapping specifications between multiple source schemas and a target schema.
Another object of the present invention is to enable a method for importing metadata of the multiple source schemas.
Yet another object of the present invention is to enable a method for importing and managing data standard(s) of the target schema.
Yet another object of the present invention is to enable a method for understanding, accepting and applying the limitations and requirements of the multiple source schemas and the target schema respectively.
Another object of the present invention is to enable a method for defining mapping specifications upon mapping metadata of the multiple source schemas with data standard(s) of the target schema.
Yet another object of the present invention is to automatically generate a conversion program that transforms the metadata of the multiple source schemas into a standardized data in compliance to target data standard requirements.

Yet another object of the present invention is to enable a user to perform a graphical user interface (GUI) based mapping between the metadata of the multiple source schemas with the data standard(s) of the target schema using historical mapping information and built-in-intelligence.
Yet another object of the present invention is to enable a method and system that updates the built-in-intelligence by adding the previously conducted mappings of the multiple source schemas with target schema within the memory of the system.
Yet another object of the present invention is to enable a method for validation of the standardized data against the target data standard.
Yet another object of the present invention is to provide the method and system for enabling the auto-generated conversion program to be re-used across multiple source schemas.
Yet another object of the present invention is to provide a computer-implemented system having several modules that enable a processor to import, understand, transform and validate the imported metadata and data standard(s) from different sources upon execution.
SUMMARY OF THE INVENTION
Before the present system, method and enablement are described, it is to be understood that this invention is not limited to the particular system and methodologies described, as there can be multiple possible embodiments which are not expressly illustrated in the present disclosure. It is also to be understood that the terminology used in the description is for the purpose of describing the particular versions or embodiments only, and is not intended to limit the scope of the present invention.
The present invention introduces a computer-implemented system and computer-implemented method for providing a tool configured for automatically and dynamically generating a conversion program for transforming multiple source schemas into a single set of standardized data in compliance to a desired target data standard.
In an aspect of the invention a system is provided for transforming legacy data into the standardized data in compliance to the target data standard, wherein the system comprises various program modules executed by a processor, including an importing module which is configured for importing metadata of the source schema and further configured for importing and managing the data standard(s) of the target schema. The said data standard(s) may include different versions of the said data standards which may be imported from different external sources. Further, the system comprises a mapping module configured for defining the mapping specifications upon mapping the imported metadata of the said source schema with the imported data standard(s) of the said target schema using a graphical user interface (GUI) based on historical intelligence and weighted metadata; an identification module configured for identifying closest target variable matches on the basis of the built-in historical intelligence and weighted metadata upon mapping the metadata of the source schema with the data standard(s) of the target schema; a transformation module configured for automatically generating the conversion program using the defined mapped specifications, the said conversion program being further configured for transforming the source schema into the target schema, the said target schema being the standardized data generated in compliance to the target data standard; and a validation module configured for conducting validation of the transformed data i.e., the standardized data against the said target data standard.
BRIEF DESCRIPTION OF THE DRAWINGS
The foregoing summary, as well as the following detailed description of preferred embodiments, is better understood when read in conjunction with the appended drawings. For the purpose of illustrating the invention, there is shown in the drawings various stages of the invention; however, the invention is not limited to the specific system components and methods disclosed in the drawings.

Figure 1 is a block diagram of the system (100) illustrating multiple embodiments of the present invention.
Figure 2 is a flow diagram (200) for data transformation in compliance to a target data standard according to one embodiment of the present invention.
Figure 3 is an illustration of the mapping process (300) between multiple source schemas and the target schema.
DETAILED DESCRIPTION OF THE INVENTION
The invention will now be described with respect to various embodiments. The following description provides specific details for understanding of, and enabling description for, these embodiments of the invention. The words "comprising," "having," "containing," and "including," and other forms thereof, are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items.
The invention generally provides a computer-implemented system and computer-implemented method for transforming a legacy data format into a standardized data in compliance to a target data standard using a data transformation tool, characterized by enabling the said data transformation tool to automatically and dynamically generate a conversion program, wherein the said conversion program is generated on the basis of defined mapping specifications between multiple source schemas and target schema. The said mapping specifications are defined on the basis of the mapping conducted between metadata of the said source schema and data standard(s) of the said target schema.
In an embodiment of the invention the computer-implemented system provides a data transformation tool; the said data transformation tool further comprises a set of processor-enabled program modules, each module performing a particular set of tasks upon execution by the processor. One of the said program modules of the data transformation tool is an importing module which upon execution by the processor is configured for importing metadata from the multiple source schemas and further configured for importing and managing the data standard(s) from the target schema. The said data standard(s) may include different versions of the said data standards which may be imported from different external sources. The content of the metadata of the multiple source schemas includes object-class and attributes which are stored in any supported database (relational database or object-oriented database). Also, the data standard(s) are the consistent data format required by organizations for ensuring the uniformity of the data. In an exemplary scenario the said data standard(s) can be considered as XML, flat-file, EDI (Electronic data interchange), XL etc. The data standard(s) facilitate system interoperability and support within the said organizations as well as among regulatory authorities and consortia. The said data standard(s) can also be self-defined by other organizations or authorities (which may be non-regulatory authorities).
The imported metadata and the data standard(s) are further processed by a mapping module of the system; the said mapping module upon execution by the processor is configured for mapping the imported metadata of the multiple source schemas with the data standard(s) of the target schema, based on which the mapping specifications are defined. The defined mapping specification creates a link between the multiple source schemas and the target schema for translating the data from one format to another format. The said link is used to define the semantic correspondence between the metadata of the multiple source schemas and the data standard(s) of the target schema. The said mapping module further provides the user with a graphical user interface (GUI) based mapping which is based on built-in intelligence and weighted metadata. The built-in intelligence and weighted metadata are stored in a memory of the said system. The said built-in intelligence is based upon the historical mapping information, i.e., the previously conducted mappings, which are continuously added to the memory of the said system.
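The accumulation of previously conducted mappings described above can be pictured with a minimal Python sketch. The specification does not disclose any data structures, so the names `MappingRule` and `MappingHistory`, the attribute names, and the choice to count mappings per attribute/target pair are all assumptions made purely for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MappingRule:
    """One link between a source attribute and a target variable (hypothetical)."""
    source_schema: str
    source_attribute: str
    target_variable: str


@dataclass
class MappingHistory:
    """Hypothetical in-memory store of previously conducted mappings."""
    counts: Counter = field(default_factory=Counter)

    def record(self, rule: MappingRule) -> None:
        # Key on the attribute/target pair (case-insensitive on the source side)
        # so that a mapping learned on one schema can inform another.
        self.counts[(rule.source_attribute.lower(), rule.target_variable)] += 1

    def times_mapped(self, source_attribute: str, target_variable: str) -> int:
        # Counter returns 0 for pairs that were never recorded.
        return self.counts[(source_attribute.lower(), target_variable)]


# A mapping specification is modelled here as a plain list of rules.
history = MappingHistory()
specification = [
    MappingRule("S1", "PATIENT_NO", "SUBJECT_ID"),
    MappingRule("S2", "patient_no", "SUBJECT_ID"),
    MappingRule("S2", "GENDER", "SEX"),
]
for rule in specification:
    history.record(rule)
```

Each time a specification is defined, its rules are recorded, so the stored counts grow with use; a later suggestion step could read `times_mapped` as one of its inputs.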

Further, on the basis of the mapping along with the built-in intelligence and the weighted metadata, the processor executes the program instructions contained in an identification module which is configured for identifying closest target variable matches. The said closest target variable matches are identified when the object-class and attributes of the imported multiple source schemas are sequentially mapped with the data standard(s) of the imported target schema. On the basis of the required data standards of the target schema and their different versions, the closest matching variables are identified and extracted from the multiple source schemas. Also, the built-in intelligence provides an autosuggest facility on the basis of weighted metadata. The autosuggest facility is possible because the said built-in intelligence regularly updates itself with the previously conducted mappings and stores the defined mapping specifications in the system memory each time.
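One way to read "closest target variable matches based on weighted metadata" is as a weighted similarity score over metadata fields, boosted when the pair has been mapped before. The sketch below is an assumption, not the disclosed algorithm: the weights, the history bonus, the field names (`name`, `label`, `type`) and the use of `difflib.SequenceMatcher` for string similarity are all illustrative choices.

```python
from difflib import SequenceMatcher

# Hypothetical weights: contribution of each metadata field to the score.
WEIGHTS = {"name": 0.5, "label": 0.3, "type": 0.2}
# Hypothetical boost for pairs that appear in the mapping history.
HISTORY_BONUS = 0.25


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def score(source: dict, target: dict, history: set) -> float:
    """Weighted metadata score, plus a bonus if the pair was mapped before."""
    total = (
        WEIGHTS["name"] * similarity(source["name"], target["name"])
        + WEIGHTS["label"] * similarity(source["label"], target["label"])
        + WEIGHTS["type"] * (1.0 if source["type"] == target["type"] else 0.0)
    )
    if (source["name"], target["name"]) in history:
        total += HISTORY_BONUS
    return total


def suggest(source: dict, targets: list, history: set, top_n: int = 3) -> list:
    """Return the names of the top_n target variables, best match first."""
    ranked = sorted(targets, key=lambda t: score(source, t, history), reverse=True)
    return [t["name"] for t in ranked[:top_n]]


source_attr = {"name": "BIRTH_DT", "label": "Date of birth", "type": "date"}
target_vars = [
    {"name": "SEX", "label": "Sex", "type": "char"},
    {"name": "BIRTHDATE", "label": "Date of Birth", "type": "date"},
    {"name": "WEIGHT", "label": "Body weight", "type": "num"},
]
suggestions = suggest(source_attr, target_vars, history=set())
```

In a GUI, the ranked list would populate the autosuggest drop-down; the user's final choice would then be written back to the history so that later rankings reflect it.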
After identifying the closest target variable matches and defining the mapping specifications, a transformation module executed by the processor is configured for automatically generating the conversion program.
The said auto-generated conversion program transforms the said multiple source schemas into a standardized data in compliance to a target data standard. Then, the final component of the system which is a validation module is executed by the processor that is configured for validating the transformed data i.e., the standardized data against the target data standard.
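As a rough illustration of generating a conversion program from a mapping specification and then validating its output, the following Python sketch emits the source text of a `convert` function and checks the converted record against a toy target standard. The specification does not state the language or form of the generated program; representing the mapping specification as target-variable-to-expression pairs, and the standard as type and length rules, are assumptions made for this example.

```python
def generate_conversion_program(spec: dict) -> str:
    """Emit Python source for convert(record) from a mapping specification.

    spec maps each target variable to a Python expression over `record`
    (a hypothetical encoding of the mapping specification).
    """
    lines = ["def convert(record):", "    out = {}"]
    for target_var, expression in spec.items():
        lines.append(f"    out[{target_var!r}] = {expression}")
    lines.append("    return out")
    return "\n".join(lines) + "\n"


def load_program(source: str):
    """Compile the generated source and return its convert function."""
    namespace = {}
    exec(source, namespace)  # only appropriate for trusted, tool-generated code
    return namespace["convert"]


def validate(row: dict, standard: dict) -> list:
    """Return a list of violations of the (hypothetical) target standard."""
    errors = []
    for var, rule in standard.items():
        if var not in row:
            errors.append(f"{var}: missing")
            continue
        value = row[var]
        if not isinstance(value, rule["type"]):
            errors.append(f"{var}: expected {rule['type'].__name__}")
        elif isinstance(value, str) and len(value) > rule["length"]:
            errors.append(f"{var}: longer than {rule['length']}")
    return errors


# Mapping specification with a concatenation, a code list mapping and a cast.
spec = {
    "FULL_NAME": "record['first'] + ' ' + record['last']",
    "SEX": "{'1': 'M', '2': 'F'}[record['gender']]",
    "AGE": "int(record['age'])",
}
standard = {
    "FULL_NAME": {"type": str, "length": 40},
    "SEX": {"type": str, "length": 1},
    "AGE": {"type": int, "length": 3},
}

program_source = generate_conversion_program(spec)
convert = load_program(program_source)
row = convert({"first": "Ada", "last": "Lovelace", "gender": "2", "age": "36"})
problems = validate(row, standard)
```

Because the program is produced from the specification rather than written by hand, the same generator can be reused for any source schema for which a specification has been defined.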
In another embodiment of the invention a computer-implemented method is considered for transforming a legacy data format into standardized data complying with the target data standard. For enabling the data transformation, a data transformation tool is provided; the said data transformation tool comprises a set of processor-enabled program modules, each module performing a particular set of tasks upon execution by the processor. One of the program modules is an importing module configured for importing metadata from the multiple source schemas and further importing and managing data standard(s) of the target schema. The content of the metadata of the multiple source schemas includes object-class and attributes which are stored in any supported database (relational database or object-oriented database). Also, the data standard(s) are the consistent data format required by organizations for ensuring the uniformity of the data. The said data standard(s) can be considered as XML, flat-file, EDI (Electronic data interchange), XL etc. The data standard(s) facilitate system interoperability and support within the said organizations as well as among authorities and consortia. The said data standard(s) can also be self-defined by other organizations or other authorities (which may be non-regulatory authorities).
The imported metadata and the data standard(s) are further processed by a mapping module, which is configured for mapping the imported metadata of the multiple source schemas with the data standard(s) of the target schema, based on which the mapping specifications are defined. The defined mapping specification creates a link between the multiple source schemas and the target schema for translating the data from one format to another format. The said link is used to define the semantic correspondence between the metadata of the multiple source schemas and the data standard(s) of the target schema. The said mapping module further provides the user with a graphical user interface (GUI) based mapping which is based on built-in intelligence and weighted metadata. The built-in intelligence and weighted metadata are stored in a memory of the said data transformation tool. The said built-in intelligence is based upon the historical mapping, i.e., the previously conducted mappings, which are continuously added to the built-in intelligence. Further, on the basis of the mapping along with the built-in intelligence and the weighted metadata, closest target variable matches are identified by an identification module. The said closest target variable matches are identified when the object-class and attributes of the imported multiple source schemas are mapped with the data standard(s) of the imported target schema. On the basis of the required data standard(s) of the target schema, the appropriate or closest matching variables are identified and extracted from the multiple source schemas. Also, the built-in intelligence provides an autosuggest facility with weighted metadata. The autosuggest facility is possible because the said built-in intelligence regularly updates itself with the previously conducted mappings and stores the defined mapping specifications in its memory each time.
After identifying the closest target variable matches and defining the mapping specifications, a conversion program is generated automatically by a transformation module.
The auto-generated conversion program transforms the said multiple source schemas into a standardized data in compliance to a target data standard. Then, a validation module is configured for validating the transformed data i.e., the standardized data against the target data standard.
Thus, the said computer-implemented system and method transforms the legacy data i.e., the metadata imported from the multiple source schemas into the standardized data. The said standardized data is generated in compliance with the target data standard according to the imported data standard(s) of the target data.
Next, the preferred embodiments of the present invention will be described below based on drawings.
Figure 1 is the block diagram of a system (100) illustrating the multiple embodiments of the present invention.
The system (100) comprises a processor, memory and a data transformation tool; the said data transformation tool further comprises an importing module (106), a mapping module (108), an identification module (110), a transformation module (112) and a validation module (118). The said system may be connected with external computational devices over a communication network to communicate the metadata related to the source and target schemas. In accordance with various embodiments of the present disclosure, the methods described herein are intended for operation as computer program modules running on a computer processor.
The said importing module (106) is configured for importing metadata of the multiple source schemas (102) i.e., S1, S2, S3 and Sn and further configured for importing and managing the data standard(s) of the target schema (104). The said metadata of the multiple source schemas (102) includes object-class and attributes stored in any supported database (relational database or object-oriented database). Also, the data standard(s) are the consistent data format required by organizations for ensuring the uniformity of data. The data standard(s) facilitate system interoperability and support within the said organizations as well as among authorities and consortia. The said data standard(s) can also be self-defined by other organizations or other authorities (which may be non-regulatory authorities). The said data standard(s) are stored in the supported databases of the organizations from where they are imported.
The mapping module (108) is further configured for processing the imported metadata of the multiple source schemas (102) and the data standard(s) of the target schema (104). The said mapping module (108) sequentially maps the imported metadata of the multiple source schemas (102) with the data standard(s) of the target schema (104), based upon which the mapping specifications are defined. Defining the said mapping specifications is a process of creating a link between the said multiple source schemas (102) and the target schema (104) for translating the data from one format to another. The said link is used to define the semantic correspondence between the metadata of the multiple source schemas (102) and the data standard(s) of the target schema (104). Further, the said mapping module (108) facilitates a graphical user interface (GUI) based mapping which is based on built-in intelligence and weighted metadata. The built-in intelligence is based upon the historical mapping, as the previously conducted mappings are continuously added to the said built-in intelligence. The built-in intelligence is stored in the memory of the said data transformation tool (not shown in the figure). Thus, on the basis of the mapping along with the built-in intelligence and the weighted metadata, an identification module (110) is configured for identifying closest target variable matches. The said closest target variable matches are identified when the object-class and attributes of the multiple source schemas (102) are processed with the data standard(s) of the target schema (104). On the basis of the required standard(s) of the target schema, the appropriate or closest matching variables are identified and extracted from the said multiple source schemas (102). Also, the built-in intelligence provides an autosuggest facility with weighted metadata. The autosuggest facility is possible because the said built-in intelligence regularly updates itself with the previously conducted mappings and stores the defined mapping specifications in its memory each time.
A transformation module (112) is configured for automatically generating the conversion program (114) on the basis of the defined mapping specifications; the said mapping specification is defined upon mapping the imported metadata of the multiple source schemas (102) with the data standard(s) of the target schema (104).
The auto-generated conversion program (114) transforms the said multiple source schemas (102) into the standardized data (116) in compliance to a target data standard (120). The validation module (118) then validates the transformed data i.e., the standardized data (116) against the target data standard (120).
Figure 2 is the flow diagram (200) for data transformation in compliance to a target data standard according to one embodiment of the present invention. A computer-implemented system comprising different modules is configured for transforming legacy data into the standardized data in compliance to the target data standard. To enable the said data transformation, metadata of the multiple source schemas and data standard(s) of the target schema are imported (202) through an importing module (106). The said importing module (106) is one of the modules of the said computer-implemented system. The other modules of the said computer-implemented system are the mapping module (108), identification module (110), transformation module (112) and validation module (118).
The content of the metadata of the multiple source schemas (102) includes object-class and attributes which are stored in any supported database (relational database or object-oriented database). Also, the data standard(s) are the consistent data format required by the organizations for ensuring the uniformity of data. The said data standard(s) can be considered as XML, flat-file, EDI (Electronic data interchange), XL etc. The data standard(s) and their respective versions facilitate system interoperability and support within the said organizations as well as among authorities and consortia. The said data standard(s) can also be self-defined by other organizations or other authorities (which may be non-regulatory authorities).
Upon importing the metadata and data standard(s), mapping is done between the imported metadata of the multiple source schemas and the data standard(s) of the target schema (204). The mapping is conducted through the mapping module (108), which further provides a graphical user interface (GUI) based mapping. The said GUI based mapping is based on the built-in intelligence and weighted metadata. The said built-in intelligence is based upon the historical mapping, i.e., the previously conducted mappings, which are continuously added to the built-in intelligence. After the mapping is done, the mapping specifications are defined and stored in the said built-in intelligence. The said built-in intelligence is stored in the memory of the said data transformation tool.
Then, the closest target variable matches are identified (206) based on the said built-in intelligence and weighted metadata upon mapping the imported metadata of the said multiple source schemas (102) with the data standard(s) of the said target schema (104). A transformation module (112) is then configured for automatically generating a conversion program (208); the said conversion program is configured for transforming the multiple source schemas into the standardized data (210). The validation module (118) is further configured for validating the transformed data i.e., the standardized data in compliance to the target data standard (212).
Figure 3 is an illustration of the mapping process (300) between multiple source schemas and the target schema. The metadata of the said multiple source schemas (102) and the data standard(s) of the target schema (104) are imported by an importing module (106). The imported metadata of the multiple source schemas (102) and the data standard(s) of the target schema (104) are stored in the database (not shown in the figure) of the data transformation tool. The multiple source schemas (102) are S1, S2, S3 and Sn. The content of the metadata of the multiple source schemas (102) includes object-class and attributes which are stored in any supported database (relational database or object-oriented database).
The mapping module (108) processes the object-class and attributes of the multiple source schemas (102) with the data standard(s) of the target schema (104). In figure 3, the attributes of source schema 1 i.e., S1 are {A, B, C}, of source schema 2 i.e., S2 are {C, D, E}, of source schema 3 i.e., S3 are {E, F, X} and of source schema n i.e., Sn are {X, Y, Z}. Further, the attributes of the target schema (104) are {A, C, D, E, F, X, Z}. During the mapping process, the mapping module (108) identifies the attributes in the multiple source schemas (102) i.e., from S1, S2, S3 to Sn which are common with the attributes of the target schema (104).
As can be seen from Figure 3, a plurality of two-sided arrows (304) are used for sequentially mapping the attributes of the source schemas (102) with the attributes of the target schema (104). As per the imported data standard(s) of the target schema (104), the mapping module (108) identifies and sequentially maps the common attributes of the source schemas (102) with the target schema (104). The attributes so identified, namely {A, C} from source schema 1 (S1), {C, D, E} from source schema 2 (S2), {E, F, X} from source schema 3 (S3) and {X, Z} from source schema n (Sn), are sequentially mapped with the target schema (104). It can be seen that the attributes {C} from S1 and S2, {E} from S2 and S3, and {X} from S3 and Sn are common and recurring during the mapping. The said mapping module (108) is capable of sequentially mapping the available or imported metadata of the multiple source schemas (102) with the desired data standard(s) of the target schema (104); depending upon the mapping, the mapping specifications (302) are defined. Further, on the basis of the defined mapping specifications (302), the transformation module (112) automatically generates a conversion program (114). The said conversion program (114) is configured for transforming the multiple source schemas (102) into the standardized data (116) in compliance to a target data standard (120). The transformed data, i.e., the standardized data (116), is further validated against the target data standard (120) using a validation module (118).
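The Figure 3 walk-through can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the attribute sets are the ones given in the description, while the function names `build_mapping_spec` and `recurring_attributes` are hypothetical and do not come from the specification, which does not disclose how the mapping module (108) is implemented.

```python
# Attribute sets as described for Figure 3. Function names are hypothetical.
SOURCES = {
    "S1": {"A", "B", "C"},
    "S2": {"C", "D", "E"},
    "S3": {"E", "F", "X"},
    "Sn": {"X", "Y", "Z"},
}
TARGET = {"A", "C", "D", "E", "F", "X", "Z"}


def build_mapping_spec(sources, target):
    """Visit the sources in order and keep the attributes shared with the target."""
    return {name: sorted(attrs & target) for name, attrs in sources.items()}


def recurring_attributes(spec):
    """Return the attributes that are mapped from more than one source schema."""
    seen = {}
    for name, attrs in spec.items():
        for attr in attrs:
            seen.setdefault(attr, []).append(name)
    return {attr: names for attr, names in seen.items() if len(names) > 1}


spec = build_mapping_spec(SOURCES, TARGET)
for name, attrs in spec.items():
    print(name, "->", attrs)
print("recurring:", recurring_attributes(spec))
```

The set intersection reproduces the identified attributes listed above, and the second helper surfaces the recurring attributes C, E and X, which a real tool would have to reconcile when several sources feed the same target variable.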
Thus, a computer-implemented method and a computer-implemented system are provided for automatically and dynamically generating the conversion program based on the mapping specifications, the said conversion program being configured for transforming the multiple source schemas into the standardized data. The said mapping specifications are based on the mapping conducted between the metadata of the multiple source schemas and the data standard(s) of the target schema. Further, the present invention allows for graphical user interface (GUI) based mapping using built-in intelligence and the mapping history of previously conducted mappings.
BEST MODE OF CARRYING OUT THE INVENTION
According to the present invention, a computer-implemented system and a computer-implemented method are configured for providing a data transformation tool for transforming a legacy data format into standardized data in compliance to the target data standard. In a real-life example, the said legacy data is created during the various clinical trials conducted by pharmaceutical companies at various sites by their investigators or data originators. A huge amount of clinical trial data is created in multiple standards and formats. The data transformation tool of the invention is configured to standardize the legacy data of different standards and formats from multiple sites into a single standardized format for enabling cross-study analysis and also for responding to queries from regulatory authorities.
According to an exemplary embodiment of the present invention, the Study Data Tabulation Model (SDTM), a standard published by the Clinical Data Interchange Standards Consortium (CDISC), can be chosen as the target data standard for transforming the said legacy data format.
The various steps of the process used to perform this transformation are as follows:
A. Upload Standards: The selected target data standard i.e., Study Data Tabulation
Model (SDTM) provides a data structure classified into clinical domains e.g. AE
(Adverse Events), DM (Demographics), DS (Disposition), etc. SDTM provides 26
domain datasets along with the specification (column definition) for each dataset. The
transformation analyst uploads the SDTM standard into UDTT using the Import
function.
B. Pre-processing: Metadata of the multiple source schemas, which can exist in multiple formats including CSV, Excel, DBMS, etc., is imported through an importing module of the system. The content of the imported metadata of the multiple source schemas, i.e., the object-classes and attributes, is converted into SAS datasets using out-of-the-box functionality provided in SAS and is uploaded or stored to a central location or any supported database (relational database or object-oriented database) for conducting the transformation.
C. Create Study: A new clinical study is created in the UDTT. The target data standard as per the exemplary embodiment, i.e., SDTM, is chosen. The data standards of the target schema are also imported and stored in the memory of the said data transformation tool. The transformation analyst also specifies the path where the imported metadata of the multiple source schemas is located. On saving this information, the imported data standards of the target schema and the imported metadata of the multiple source schemas are saved in the memory of the said data transformation tool.
For example, the following metadata for a source dataset AELOG will be imported.

Column Name	Data type	Length	Label
AEACTIP	String	2	Action Taken
AECAUDIS	String	2	AE Caused Subject to discontinue
AECAUS	String	2	Causality
AECODE	String	18	AE Code
AECONDAT	String	10	Date of last contact
AEDTXT	String	200	AE Dictionary Text
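As a minimal sketch of how such imported metadata might be held in memory, the rows of the AELOG table can be parsed into simple records. The `ColumnMeta` class and the `parse_metadata` function are hypothetical names introduced for illustration only; the specification does not describe the internal representation used by the tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMeta:
    """One row of source metadata (hypothetical structure)."""
    name: str
    data_type: str
    length: int
    label: str


# Rows copied from the AELOG example; whitespace separates the first three fields.
RAW = """\
AEACTIP String 2 Action Taken
AECAUDIS String 2 AE Caused Subject to discontinue
AECAUS String 2 Causality
AECODE String 18 AE Code
AECONDAT String 10 Date of last contact
AEDTXT String 200 AE Dictionary Text
"""


def parse_metadata(text):
    """Split each row into name, type, length and a free-text label."""
    rows = []
    for line in text.strip().splitlines():
        name, data_type, length, label = line.split(maxsplit=3)
        rows.append(ColumnMeta(name, data_type, int(length), label))
    return rows


aelog = parse_metadata(RAW)
for column in aelog:
    print(column)
```

Limiting the split to three separators keeps multi-word labels such as "AE Caused Subject to discontinue" intact.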

The mapping module of the said computer-implemented system further maps the object-class and attributes of the imported metadata of the multiple source schemas with the imported data standards of the target schema which are stored in any supported database (relational database or object-oriented database).
Upon mapping the said object-class and attributes of the imported metadata of the multiple source schemas with the imported data standards of the target schema, closest target variable matches are identified.
The said mapping specification is defined by creating a link between the said multiple source schemas and the target schema for translating the data from the legacy data format into standardized data. The said link is used for defining the semantic correspondence between the metadata of the multiple source schemas and the data standard of the target schema.
D. Define Auto-Suggest Scores: The Auto-Suggest function is used to provide weightages based on the mapping performed between the imported metadata of the multiple source schemas and the imported data standard of the target schema. During the mapping process, the said closest target variable matches are identified based on the summation of the weightages assigned to each parameter. A sample table of an auto-suggest definition is shown below:

Parameter	Weightage
Variable Name	50
Variable Label	20
Data type	10
Length	10
Dataset Name	10
TOTAL	100
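The weighted summation can be sketched as follows. The weightages are the ones in the sample table; everything else is an assumption made for illustration: the specification does not state how the per-parameter similarity is measured, so this sketch uses Python's standard `difflib.SequenceMatcher` ratio, and the candidate target variables with their lengths and data types are hypothetical.

```python
from difflib import SequenceMatcher

# Weightages from the sample auto-suggest definition (they sum to 100).
WEIGHTS = {"name": 50, "label": 20, "data_type": 10, "length": 10, "dataset": 10}


def similarity(a, b):
    """Assumed similarity in [0, 1]; the source does not specify a measure."""
    return SequenceMatcher(None, str(a).lower(), str(b).lower()).ratio()


def score(source, target):
    """Sum the weightage of each parameter scaled by its similarity."""
    return sum(w * similarity(source[k], target[k]) for k, w in WEIGHTS.items())


def suggest(source, candidates, top=3):
    """Rank candidate target variables by descending score."""
    ranked = sorted(candidates, key=lambda t: score(source, t), reverse=True)
    return [(t["name"], round(score(source, t), 1)) for t in ranked[:top]]


source = {"name": "AECAUS", "label": "Causality", "data_type": "String",
          "length": 2, "dataset": "AELOG"}

# Hypothetical target variables; lengths and types are illustrative only.
candidates = [
    {"name": "AETERM", "label": "Reported Term for the Adverse Event",
     "data_type": "String", "length": 200, "dataset": "AE"},
    {"name": "AEACN", "label": "Action Taken with Study Treatment",
     "data_type": "String", "length": 20, "dataset": "AE"},
    {"name": "AEREL", "label": "Causality",
     "data_type": "String", "length": 2, "dataset": "AE"},
]

for name, value in suggest(source, candidates):
    print(name, value)
```

A variable compared with itself scores the full 100, and in this example the matching label outweighs the slightly closer variable name of a competing candidate, which is the kind of trade-off the weightages are meant to control.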
E. Mapping: The Mapping screen is used to map the imported metadata of the multiple source schemas with the imported data standard of the target schema. The said computer-implemented system provides multiple transformation mechanisms, including 1:1 mapping, concatenation of multiple imported metadata of the source schemas, and SAS expressions for derived variables (e.g., calculating Age from Date of Birth). The present invention also allows the user to copy the said mapping specification from a previously conducted clinical study and make the necessary changes.
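The three transformation mechanisms named in step E can be illustrated with a small Python sketch. In the described embodiment these rules are SAS expressions; Python is used here only to show the idea. All source field names (`STUDY`, `SITE`, `SUBJ`, `DOB`, `VISITDT`), the helper names and the sample record are hypothetical.

```python
from datetime import date


def one_to_one(src):
    """1:1 mapping: copy a single source field."""
    return lambda rec: rec[src]


def concat(*srcs, sep=""):
    """Concatenation of several source fields."""
    return lambda rec: sep.join(str(rec[s]) for s in srcs)


def age_in_years(dob_field, ref_field):
    """Derived variable: completed years between date of birth and a reference date."""
    def derive(rec):
        dob, ref = rec[dob_field], rec[ref_field]
        # Subtract one year if the birthday has not yet occurred in the reference year.
        return ref.year - dob.year - ((ref.month, ref.day) < (dob.month, dob.day))
    return derive


# Hypothetical mapping specification: target variable -> rule.
MAPPING = {
    "SEX": one_to_one("SEX"),
    "USUBJID": concat("STUDY", "SITE", "SUBJ", sep="-"),
    "AGE": age_in_years("DOB", "VISITDT"),
}


def transform(record, mapping):
    """Apply every rule of the mapping specification to one source record."""
    return {target: rule(record) for target, rule in mapping.items()}


record = {"STUDY": "ABC123", "SITE": "01", "SUBJ": "0007", "SEX": "M",
          "DOB": date(1980, 6, 15), "VISITDT": date(2012, 2, 21)}
out = transform(record, MAPPING)
print(out)
```

Because the mapping specification is plain data plus small rule functions, copying it from an earlier study and adjusting a few entries, as the step describes, amounts to copying and editing a dictionary.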
F. Generate Transformation Programs: According to the exemplary embodiment of the present invention, the conversion program, i.e., the SAS transformation programs, is automatically generated through the transformation module on the basis of the defined mapping specifications; the said mapping specifications are defined upon mapping of the imported metadata of the multiple source schemas with the imported data standard of the target schema. SAS macros can be created and stored in a library for reuse. The transformation program created by the system can be further tweaked if required.
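Generating a program from a mapping specification can be sketched as simple text templating. The following is an assumption-laden illustration, not the tool's actual generator: the rule format, the function name `generate_sas`, the library and dataset names (`sdtm.ae`, `raw.aelog`) and the source columns `STUDYID` and `SUBJID` are all hypothetical, and the emitted SAS DATA step is illustrative and has not been run against SAS.

```python
def generate_sas(target_ds, source_ds, mapping):
    """Emit the text of a SAS DATA step from a mapping specification.

    Each rule is a (kind, argument) pair:
      ("copy", column)            1:1 mapping
      ("concat", (sep, columns))  concatenation via the SAS CATX function
      ("expr", sas_expression)    user-supplied SAS expression
    """
    lines = [f"data {target_ds};", f"    set {source_ds};"]
    for target_var, (kind, arg) in mapping.items():
        if kind == "copy":
            expr = arg
        elif kind == "concat":
            sep, cols = arg
            expr = f"catx('{sep}', {', '.join(cols)})"
        elif kind == "expr":
            expr = arg
        else:
            raise ValueError(f"unknown rule kind: {kind}")
        lines.append(f"    {target_var} = {expr};")
    # Keep only the mapped target variables in the output dataset.
    lines.append(f"    keep {' '.join(mapping)};")
    lines.append("run;")
    return "\n".join(lines)


# Hypothetical mapping specification for the AELOG source dataset.
mapping = {
    "AETERM": ("copy", "AEDTXT"),
    "USUBJID": ("concat", ("-", ["STUDYID", "SUBJID"])),
    "AEREL": ("expr", "upcase(AECAUS)"),
}

program = generate_sas("sdtm.ae", "raw.aelog", mapping)
print(program)
```

Because the output is ordinary program text, an analyst can still edit it by hand after generation, which matches the statement that the generated program can be further tweaked if required.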

G. Execute Programs: The generated SAS transformation programs are configured for transforming the imported metadata of the multiple source schemas into standardized data in compliance to the said target data standard.
H. Validation: After transforming the said legacy data, i.e., the imported metadata of the multiple source schemas, into the said standardized data in compliance to the said target data standard, a validation module of the said computer-implemented system is configured for validating the transformed data against the said target data standard. As per the example considered in the exemplary embodiment of the present invention, the said validation module performs validation of the transformed data against the rules specified for the selected target data standard, i.e., SDTM. CDISC provides 143 structural and data checks to be conducted to ensure that the transformed data is in compliance with the specifications of the selected target data standard, i.e., SDTM. The errors and warnings generated are displayed and can be exported to a report for further action.
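A minimal sketch of rule-based validation that produces errors and warnings is shown below. The three checks (missing required value, unexpected data type, value longer than the defined length) and the rule table are hypothetical stand-ins chosen for illustration; they are not the CDISC checks referred to above, and the function name `validate` is not taken from the specification.

```python
# Hypothetical rules for three target variables; not the actual CDISC checks.
SPEC = {
    "USUBJID": {"type": str, "length": 40, "required": True},
    "AETERM": {"type": str, "length": 200, "required": True},
    "AESEQ": {"type": int, "length": None, "required": True},
}


def validate(rows, spec):
    """Return (severity, row number, variable, message) tuples for each finding."""
    findings = []
    for i, row in enumerate(rows, start=1):
        for var, rule in spec.items():
            value = row.get(var)
            if value is None or value == "":
                if rule["required"]:
                    findings.append(("ERROR", i, var, "required value is missing"))
                continue
            if not isinstance(value, rule["type"]):
                findings.append(("ERROR", i, var, "unexpected data type"))
            elif isinstance(value, str) and len(value) > rule["length"]:
                findings.append(("WARNING", i, var, "value exceeds defined length"))
    return findings


rows = [
    {"USUBJID": "ABC123-0007", "AETERM": "Headache", "AESEQ": 1},
    {"USUBJID": "", "AETERM": "X" * 250, "AESEQ": "2"},
]

findings = validate(rows, SPEC)
for finding in findings:
    print(finding)
```

Returning findings as plain tuples keeps display and export separate from the checks themselves, so the same list could be shown on screen or written to a report.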
The illustrations of arrangements described herein are intended to provide a general understanding of the structure of various embodiments, and they are not intended to serve as a complete description of all the elements and features of methods and system that might make use of the components described herein.
The methodology and techniques described with respect to the present invention can be performed using a computer-implemented system or other computing device within which a set of instructions, when executed, may cause the said computer-implemented system to perform any one or more of the methodologies discussed above. The said computer-implemented system may include an embedded processor which is configured for executing the said programmed instructions or the said set of instructions. The said computer-implemented system is composed of different modules; each module is configured for executing programmed instructions or a set of instructions to perform a particular task. According to the embodiments of the present invention, the computer-implemented system may also operate as a standalone device. In some embodiments, the said computer-implemented system may be connected (e.g., using a network) to other computing devices. In a networked deployment, the computer-implemented system may operate in a client-server arrangement. The said computer-implemented system may be configured to work along with a server computer, a client user computer, a personal computer (PC), a tablet PC, a laptop computer, a desktop computer, a control system, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine.
In accordance with various embodiments of the present invention, the computer-implemented methods described herein are intended for operation as software programs running on a computer processor (the processor embedded within the said computer-implemented system).
Although the invention has been described in terms of specific embodiments and applications, persons skilled in the art can, in light of this teaching, generate additional embodiments without exceeding the scope or departing from the spirit of the invention described herein.

We Claim:
1. A computer-implemented method for transforming legacy data into
standardized data in compliance to a target data standard, wherein the said
method comprises of:
a. importing metadata corresponding to source schema;
b. importing at least one data standard corresponding to target schema;
c. mapping the imported metadata of the said source schema with the imported
data standards of the said target schema for defining mapping specifications
using graphical user interface based on built-in historical intelligence and
weighted metadata;
d. identifying closest target variable matches based on the built-in historical
intelligence and weighted metadata upon mapping the imported metadata of
the said source schema with the data standards of the said target schema;
e. automatically generating a conversion program using the defined mapped
specification for transforming the source schema into the target schema, the
said target schema is the said standardized data generated in compliance to
the target data standard; and
f. validating the transformed standardized data against the said target data
standard.
2. The method as claimed in claim 1, wherein the target data standard includes data standard names and their respective versions, code, data set name, data set code, designation, variable name, label, type, length, data standards required by companies and organizations and associating code lists to the target data variables and combination thereof.
3. The method as claimed in claim 1, wherein the mapping specifications includes concatenation, definition of expressions, 1:N variable mapping, code list mapping, mapping history, autosuggest and combination thereof.

4. The method as claimed in claim 1, wherein the transformation module is
modularized to be re-used across multiple source schemas.
5. A system for transforming legacy data into standardized data in compliance to
a target data standard, wherein the system comprises of:
a. an importing module configured for importing metadata of source schema
and further configured for importing and managing at least one data
standard of target schema;
b. a mapping module configured for defining mapping specifications upon
mapping the imported metadata of the said source schema with the imported
data standards of the said target schema using graphical user interface based
on historical intelligence and weighted metadata;
c. an identification module configured for identifying closest target variable
matches on the basis of the built-in historical intelligence and weighted
metadata upon mapping the metadata of the source schema with the data
standards of the target schema;
d. a transformation module configured for automatically generating a
conversion program using the defined mapped specifications, the said
conversion program is further configured for transforming the source
schema into the target schema, the said target schema is the said
standardized data generated in compliance to the target data standard; and
e. a validation module configured for conducting validation of the transformed
standardized data against the said target data standard.
6. The system as claimed in claim 5, wherein the target data standard includes data
standard names and their respective versions, code, data set name, data set code,
designation, variable name, label, type, length, data standards required by
companies and organizations and associating code lists to the target data
variables and combination thereof.

7. The system as claimed in claim 5, wherein the mapping specifications includes concatenation, definition of expressions, 1:N variable mapping, code list mapping, mapping history, autosuggest and combination thereof.
8. The system as claimed in claim 5, wherein the generated transformed module is capable of being re-used across multiple source schemas.

Documents

Application Documents

#	Name	Date
1	ABSTRACT1.jpg	2018-08-11
2	463-MUM-2012-FORM 3.pdf	2018-08-11
3	463-MUM-2012-FORM 26(26-3-2012).pdf	2018-08-11
4	463-MUM-2012-FORM 2.pdf	2018-08-11
5	463-MUM-2012-FORM 2(TITLE PAGE).pdf	2018-08-11
6	463-MUM-2012-FORM 18.pdf	2018-08-11
7	463-MUM-2012-FORM 1.pdf	2018-08-11
8	463-MUM-2012-FORM 1(3-7-2012).pdf	2018-08-11
9	463-MUM-2012-FER.pdf	2018-08-11
10	463-MUM-2012-DRAWING.pdf	2018-08-11
11	463-MUM-2012-DESCRIPTION(COMPLETE).pdf	2018-08-11
12	463-MUM-2012-CORRESPONDENCE.pdf	2018-08-11
13	463-MUM-2012-CORRESPONDENCE(3-7-2012).pdf	2018-08-11
14	463-MUM-2012-CORRESPONDENCE(26-3-2012).pdf	2018-08-11
15	463-MUM-2012-CLAIMS.pdf	2018-08-11
16	463-MUM-2012-ABSTRACT.pdf	2018-08-11
17	463-MUM-2012-OTHERS [27-08-2018(online)].pdf	2018-08-27
18	463-MUM-2012-FER_SER_REPLY [27-08-2018(online)].pdf	2018-08-27
19	463-MUM-2012-COMPLETE SPECIFICATION [27-08-2018(online)].pdf	2018-08-27
20	463-MUM-2012-CLAIMS [27-08-2018(online)].pdf	2018-08-27
21	463-MUM-2012-ABSTRACT [27-08-2018(online)].pdf	2018-08-27
22	463-MUM-2012-HearingNoticeLetter22-08-2019.pdf	2019-08-22
23	463-MUM-2012-Written submissions and relevant documents (MANDATORY) [06-09-2019(online)].pdf	2019-09-06

Search Strategy

1	search_463mum2012_30-01-2018.pdf