"System And Method For Composition Of Mappings Given By Dependencies"

< Back

"System And Method For Composition Of Mappings Given By Dependencies"

Abstract: A system that facilitates composition of schema mappings. A general algorithm is provided for composing a broad class of mappings, where one or both mappings are not functions, such as constraints between two schemas and the inverse of functions. A composition component performs composition on schema mappings of disparate data sources, which schema mappings can be expressed by at least one of full, embedded, and second-order dependencies, wherein the second-order dependencies need not be in source-to-target form. The algorithm for composition further includes a procedure that tests whether the composition algorithm will terminate.

Get Free WhatsApp Updates!
Notices, Deadlines & Correspondence

Patent Information

Application #

Filing Date

02 November 2005

Publication Number

40/2009

Publication Type

INA

Invention Field

COMPUTER SCIENCE

Status

Parent Application

Applicants

MICROSOFT CORPORATION

ONE MICROSOFT WAY, REDMOND, WASHINGTON 98052, U.S.A

Inventors

1. ALAN P. NASH

ONE MICROSOFT WAY, REDMOND, WASHINGTON 98052, U.S.A

2. PHILIP A. BERNSTEIN

ONE MICROSOFT WAY, REDMOND, WASHINGTON 98052, U.S.A

3. SERGEY MELNIK

ONE MICROSOFT WAY, REDMOND, WASHINGTON 98052, U.S.A

Specification

Title. SYSTEM AND METHOD FOR COMPOSITION OF MAPPrNGS GIVEN BY DEPENDENCIES TECHNICAL FIELD [0001 ] This invention is related to systems and methods of constructing a mapping between database schemas, which mapping is composed from existing mappings. BACKGROUND OF THE INVENTION [0002] The advent of global communications networks such as the Internet has facilitated access to enormous amounts of data. The world is populated with information sources where in many cases the data is represented differently from source to source. The free flow of information prevalent today in wired and wireless regimes demands that the source and destination be compatible insofar as storing and interpreting the data for use. A major problem facing companies and individuals today is that the data existing in one model/schema needs to be accessed via a different model/schema. However, such processes are being hampered by a largely disparate and ever-changing set of models/schemas Such an example can be found in data warehousing where data is received from many different sources for storage and quick access from other sources. Converting data from one model to another model is not only time-consuming and resource intensive, but can be fraught with conversion problems. [0003] Schema mappings specify the relationships between heterogeneous database sources and are used to support data transformation, data exchange, schema evolution, constraint checking, and other tasks. Mapping composition is an operation that takes as input two mappings mapAB and mapBC between schemas A, B, C and produces a single mapping mapAc that specifies the same set of constraints between A and C that is given by the combined mappings mapAB and mapec- [0004] To illustrate the use of mapping composition, consider the following schema evolution scenano. Assume that S1, S2, S3, S4 are versions of a schema used in successive releases of a product The mapping mapu is used to migrate data from schema S1's format to schema S2's format. Similarly, mapping map23 is used to migrate data from schema S2's format to schema S3's format, and finally, map34 is used to migrate data from schema S3's format to schema S4's format. A conventional way of migrating the data from version S1 to version S4 is by executing map12, map23, map34 one by one, which is time-consuming and costly. To migrate the data from S| to S4 in a single step, mapping composition is required The mapping mapu can be obtained by first composing map12 and map23 and then composing the resulting mapping with map34. [0005] Now suppose V1 is a view defined on S, and mapivi is a function. To migrate view V1 from S1 to S2, composition is again used. The inverse of mapping map12 is composed with mapivi to obtain a mapping mapivi from S2to V1. [0006] Algorithms for mapping composition are well-known for the case where each mapping is a function (1 e , maps one database state to exactly one database state, and both mappings have the same directionality). However, there is a substantial unmet need for an algorithm which is suitable for a broader class of mappings where one or both mappings are not functions SUMMARY OF THE INVENTION [0007] The following presents a simplified summary of the invention in order to provide a basic understanding of some aspects of the invention This summary is not an extensive overview of the invention. It is not intended to identify key/critical elements of the invention or to delineate the scope of the invention. Its sole purpose is to present some concepts of the invention in a simplified form as a prelude to the more detailed description that is presented later. [0008] The invention is a general algorithm that is suitable for composing a broader class of mappings, where one or both mappings are not functions, such as constraints between two schemas and the inverse of functions. This enables construction of a mapping between database schemas from two existing mappings by composing them The algorithm for composition includes a procedure that tests whether the composition algorithm will terminate [0009] The invention disclosed and claimed herein, in one aspect thereof, comprises a system that facilitates composition of schema mappings A composition component performs composition on existing schema mappings of disparate data sources, which schema mappings can be expressed by at least one of full, embedded, and second-order dependencies, wherein the second-order dependencies need not be in source-to-target form 33 [0010] In another aspect of the subject invention, a methodology is provided for composing two schema mappings by Skolemizing the schema mappings expressed by the embedded dependencies to obtain schema mappings expressed by the second-order dependencies, computing a finite axiomatization of all constraints over input and output signatures which can be deduced from the constraints that give the schema mappings, and de-SkoIemizing a finite axiomatization to obtain a schema mapping expressed by an embedded dependency. [0011] In still another aspect thereof, a methodology is provided where the act of de-Skolemizing the finite axiomatization further comprises unnesting Skolem functions using equality to generate a conclusion; checking that all variables in the conclusion appear in a variable dependency set of the Skolem functions; generating all possible clauses that can be obtained by unifying Skolem functions that appear in the conclusion; and de-Skolemizing each clause separately by substituting each Skolem term by a new existential variable [0012] To the accomplishment of the foregoing and related ends, certain illustrative aspects of the invention are described herein in connection with the following description and the annexed drawings These aspects are indicative, however, of but a few of the various ways in which the principles of the invention can be employed and the subject invention is intended to include all such aspects and their equivalents Other advantages and novel features of the invention will become apparent from the following detailed description of the invention when considered in conjunction with the drawings BRIEF DESCRIPTION OF THE DRAWINGS (0013) FIG. 1 illustrates a diagram that represents construction of a resulting mapping by composition of two existing mappings between database schemas in accordance with the subject invention. [0014] FIG 2 illustrates a diagram that represents mapping composition of logical constraints in accordance with the invention (0015] FIG. 3 illustrates a methodology of composition of dependencies in accordance with the invention. [0016] FIG. 4 illustrates a composition methodology in accordance with the invention that imposes conditions and employs checks to determine if composition terminates to a correct answer. [0017] FIG 5 illustrates a methodology of performing de-Skolemization on a finite axiomatization to obtain an embedded dependency in accordance with the invention. [0018] FIG 6 illustrates a diagram of multiple compositions on more than three data source schemas in accordance with the invention. [0019] FIG. 7 illustrates composition of multiple mappings that can be performed using a single invocation of the algorithm to obtain the output mapping, in accordance with the invention (0020] FIG 8 illustrates a methodology for de-Skolemization in accordance with the subject invention. 10021 ] FIG. 9 illustrates a block diagram of a computer operable to execute the disclosed architecture. [0022] FIG 10 illustrates a schematic block diagram of an exemplary computing environment in accordance with the subject invention. DETAILED DESCRIPTION OF THE INVENTION [0023] The invention is now described with reference to the drawings, wherein like reference numerals are used to refer to like elements throughout In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the subject invention. It may be evident, however, that the invention can be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to facilitate describing the invention. [0024] As used in this application, the terms "component" and "system" are intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution. For example, a component can be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a server and the server can be a component. One or more components can reside within a process and/or thread of execution, and a component can be localized on one computer and/or distributed between two or more computers COMPOSITION OF MAPPINGS [0025] The subject invention is a general algorithm that is suitable for composing a broader class of mappings, where one or both mappings are not functions, such as constraints between two schemas (such as map13, map23, and map34) and the inverse of functions (such as the inverse of map12) [0026] FIG. 1 illustrates a diagram 100 that represents construction of a resulting mapping 102 by composition of two existing mappings (104 and 106) between database schemas (not shown) in accordance with the subject invention. A first schema mapping 104 (denoted SCHEMA MAPPING12) is provided and which can be a function or a nonfunction. Similarly, a second schema mapping 106 (denoted SCHEMA MAPPING23) is provided and can be a function or a non-function Algorithms for mapping composition are well-known for the case where each mapping is a function. Thus, the invention addresses instances only where one or both of the mapping schemas (104 and 106) is a non-function. [0027] As illustrated by the dashed lines, the following scenarios are addressed: when the first schema mapping 104 is a function, the second schema mapping 106 is a nonfunction; when the first schema mapping 104 is a non-function, the second schema mapping 106 is a function; and, when the first schema mapping 104 is a non-function, the second schema mapping 106 is a non-function. The schema mappings (104 and 106) are composed using a composition algorithm component 108 to generate the resulting mapping 102 (denoted SCHEMA MAPPDNG13). The composition component 108 employs a deductive procedure to obtain a finite axiomatization of all constraints that need to appear in the composition mapping. This is described in detail infra. [0028] In any of the three non-function mappings, the mappings can be represented by logical constraints, or clauses. FIG. 2 illustrates a diagram 200 that represents mapping composition of logical constraints in accordance with the invention. There are provided three data base schemas 202 (denoted S1, S2, and S3) from which first and second schema mappings (204 and 206) (denoted respectively as MAPPING): and MAPPrNG23) are derived [0029) A compose component 208 (similar to component 108) facilitates composition of the two schema mappings (204 and 206) into a resulting schema mapping 210 (denoted MAPPING13). The considered constraint languages include full dependencies (CQ0~ mappings), embedded dependencies (CQ= mappings), or second-order dependencies (SOCQ= mappings, also called second-order tuple-generating dependencies). Thus, each of the first and second mappings (204 and 206) can include constraints which are full, embedded and/or second-order dependencies. Given two finite CQo=, CQ=, or SOCQ= mappings mapu and map23 that are not necessarily functional, the challenge is to find a CQ0= , CQ=, or SOCQ= mapping that specifies the resulting mapping 210 of the composition of map12 and map23. [0030] Referring now to FIG. 3, there is illustrated a methodology of composition of dependencies in accordance with the invention. While, for purposes of simplicity of explanation, the one or more methodologies shown herein, eg , in the form of a flow chart, are shown and described as a series of acts, it is to be understood and appreciated that the subject invention is not limited by the order of acts, as some acts may, in accordance with the invention, occur in a different order and/or concurrently with other acts from that shown and described herein. For example, those skilled in the art will understand and appreciate that a methodology could alternatively be represented as a series of interrelated states or events, such as in a state diagram Moreover, not all illustrated acts may be required to implement a methodology in accordance with the invention. [0031] The composition is computed in three steps the CQ' mappings are Skolemized to get the second-order dependencies (SOCQ=) mappings. Skolemization is a way to remove existential quantifiers from a formula. Next, a finite axiomatization of all constraints over the input and output signatures is found that can be deduced from the constraints that give the mappings. Finally, the finite axiomatization is de-Skolemized to get a CQ= mapping. Accordingly there is provided a methodology, such that, at 300, the embedded dependencies are Skolemized to obtain the second-order dependencies. At 302, a finite axiomatization of all constraints over input and output signatures is found. The axiomatizations can be deduced from constraints that give the mappings. At 304, the finite axiomatization is de-Skolemized to obtain the embedded dependencies. [0032) In the case of CQ0" mappings and SOCQ= mappings, only act 302 of FIG 3 applies. Accordingly, FIG. 4 illustrates a composition methodology in accordance with the invention that imposes conditions and employs checks to determine if composition terminates to a correct answer At 400, the compose algorithm is run. The following algorithm compose assumes that the input mapping map12 is specified by finite sets of CQ0" dependencies Σ12 between schenias S1 and S2, the input mapping map23 is specified by Σ23 between S2 and S3, and the output mapping map13 is given by Σ13 between S1 and S3. (Generalization of this basic algorithm for CQ= and SOCQ= mappings is presented infra.) [0033] Algorithm Compose (Σ12,, Σ23) .- Set I .= E12 v S23 Repeat Set 2' := 0 For every pair of clauses V where 2 2. R(xy)→S(xv) 3. R(v,v,) → S(v1v1) [2] x := v1 y :=v1 4. R(v,v,)→S(v1v1) [3] 5. S(zz)→(zz) 6 S(v1v1)→ T(v1v1) [5]z:=v1 7. R(v1v1)→T(v1v1) [4,6] [0075] The following replacements have been made for each rule: 1. Axiom from : removed by 1. 2. Axiom from Σ; unchanged by 2. 3. Expand rename-1 replaced with vi by 3. 4. Resolution on 1,3: trivial expand/rename by 4. 5. Axiom from I; unchanged by 2. 6. Expand rename: 1 replaced with vi by 3. 7. Resolution on 4,6: resolution by 5. [0076] y is a valid deduction witnessing Σ |- £ since axioms from A are no longer used, and since replacements 1 through 5 above ensure that rule R is correctly deduced from the previously replaced rules. Additionally,  £ |- R(c ) is witnessed by a deduction y which consists of the following sequence, - the axioms from A which were removed from y to obtain y. - £ followed by £' obtained from £ by renaming each variables ve to the corresponding constant c, and - a sequence of resolution steps using each axiom from A and starting with £' [0077] Definition 2. Given an instance D, the result of chasing D with constraints Z c 1C( SOCQ=) and the set of Skolem functions F is abbreviated by chase(D, Σ, F) and is denoted by the database D" obtained from (Equation Removed) where c is a tuple of constants, A is the set of facts given by D: (Equation Removed) and Φ is the set of facts given by F: (Equation Removed) as follows Define c0 = c1 iffΣ U  u Φ |- C0 = C1. (0078) Now to obtain D" from D\ pick one constant c0 from every equivalence class and replace every constant in that equivalence class with C0. That is, D" := U /= All functions in F are required to have the same domain. If they are finite, then chase(Z), Σ, F) is finite. This definition is a variation on the usual definition, where the functions F are constructed during the chase process. (0079] Proposition 2. Chase(D, Σ, F)|= Σ. Proof. Set D" := chase(D,Z ,F) |= Σ and assume D" |= £ for some £ ε Σ. This implies that there must be some assignment of constants c to the variables x in the premises so that the premise in £ holds and the conclusion fails Assume that the conclusion consists of a single relational atom R( y) or a single equation y1 = y2 where y is a sequence of terms over x . Set y' to the sequence of terms obtained from y by replacing every variable in .Y with the corresponding constant c . If the former holds then  u Φ, £ |- R(y') holds and R(y') is not in D'. Since E ε Σ, Proposition 1 implies that Σ w  UΦ |- R( y') holds, contradicting the definition of D'. If the latter holds then  =y'2andy'i =y'i holds. Since £ ε Σ. Proposition 1 implies that Σ u U Φ |- y'i = v'2 holds, contradicting the definition of = and D". COMPOSITION OF CQ0= -MAPPINGS (0080) The following description of composition begins by studying composition of CQ0-mappmgs and CQ0= -mappings. Later, the techniques introduced to handle these cases are extended to handle SOCQ= and CQ~-mappings. Necessary and sufficient, non-computable conditions for the composition of two CQ0= -mappings to be a CQ= -mapping are given (see Theorem 1 below). Then, CQ0 and CQ0= are shown not to be closed under composition (see Example 3 below) and, furthermore, that determining whether the composition of two CQ0= -mappings is a CQ0= -mapping is undecidable (see Theorem 2 infra). An algorithm for composition (CQ0= COMPOSE) is given, which works correctly on CQ0= -mappings satisfying the conditions of Theorem 3 infra, which can be checked in polynomial time. Also, a subset of CQ0= recognizable in polynomial time, called good-CQ0=, is defined, which is closed under composition (see Theorem 4 infra). [0081] The composition of two CQ0= -mappings given by (σ1,σ2,,Σ12) and (σ2, σ3, Σ23) can be computed using the following algorithm: Algorithm CQo COMPOSE (Ea2, S23) Set £-=E12 u S23 Repeat Set I := 0 For every pair 6 £ If there exists a o2-resolvent £ of ^""J1 and there is no variant of { in E then set E' := I u {{} Set Z := I u I' Until £' = 0 Return I13 := Σ| on [0082] The next step is to verify the correctness of the algorithm and study the conditions under which the algorithm terminates. Notice that CQ0 and CQ0= are not closed under composition, as the following example shows. [0083] Example 3. Consider the CQo-mappings mn and m23 given by (a1, a2, Σ12) and (02, (7.i, Σ23) where I|? is (Equation Removed) and where σ1, = {R} ,σ2,- {S}, and σ3, = {T}. Together, these constraints say that RS  Tand that S is closed under transitive closure. This implies tc{R)  T, where c(R) denotes the transitive closure of R. It can be seen that the only CQ0= -constraints that hold between R and T are constraints of the form (Equation Removed) but no finite set of them expresses tc(R)  T. [0084] In fact, the composition m12 m23 is not even expressible in FO (first-order logic) Otherwise, given an FO sentence such that (Equation Removed) an FO formula xy) can be created, which is obtained by replacing every occurrence of T(uv) in  with x V y  v. Then given a domain D with R D2 , R |= y[ab] (Equation Removed) and therefore, (.x,v) y (x,y') says that R is a connected graph, which cannot be expressed in FO [0085] When is the composition of two CQ0 -mappings a CQ0-mapping? Theorem 1 herein gives necessary and sufficient conditions An obstacle for the composition to be a CQ0-mapping can be recursion, yet recursion is not always a problem In Theorem 3, sufficient conditions are provided that can be checked efficiently, and in Theorem 4, it is shown that good-CQ0 a subset of CQ0=efined below, is closed under composition. All these results also hold for CQ0. [0086] Theorem 1. If the CQ0-mappings m1 m2 are given by (σ1, σ2, Σ12) and (σ2, σ3, Σ23) with Σ123:=Σ12 u Σ23 and σ13 = σ1 Uσ3. then the following are equivalent: (1) There is a finite set of constraints Σ13 lCfCQ0) over a signature σ13 such that m :- m1 • m2 is given by (σ1, σ3, σ13). (2) There is a finite set of constraints ΣI3 IC( CQ0) over the signature σ13 such that (Equation Removed) (3) There is m such that for every £ over the signature σ13 satisfying Σ123 |- £ there is a deduction of £ from £123 using at most m σ2-resolutions. [0087] Proof. The proof uses Lemmas 1 and 2 below The proof first shows the equivalence of (1) and (2), and then shows the equivalence of (2) and (3). (Equation Removed) This shows that (1) holds. Conversely, assume (1) holds. Then (A, C) |= DC(CQ0= Σ123) 0lJ iff 3B (A, B, C) 1= £123 (by Lemma 1)) iff (A, B) € my mi (by definition of-) iff (A, O 1= £13 (since (2) holds) iff {A, C) NDC( CQ;, £13). (by soundness of DC) This shows that (2) holds. [0088) Now assume (3) holds. Set £ to the set of all constraints in DC( CQ0=, Σ123j) 0ij which can be deduced using at most m σ2-resolutions and no other resolutions. Σvery constraint in £ can be obtained by expand/rename from a finite subset T.]y c £. It can be shown that (2) holds. [0089] Assume that there is a deduction 7 witnessing £123 I" £ Since (3) holds, it can be assumed that 7 has m' (^-resolutions. By Lemma 2, there is a deduction 7 witnessing £1231" £ also with tri (72-resolutions, and with all of them occurring before any other resolutions Since the last line of 7 does not contain any symbols from a2, it can be assumed that 7does not contain any lines containing symbols from a2 after the last (^-resolution. [0090] Break √ into two parts' √1 the initial segment of y up to and including the last a2-resolution, and √2 the remainder of √- By definition of Σ, every constraint 4' in 71 must be in I, and therefore Σ131- t/> holds. Σvery constraint y in √2 does not contain any symbols from o2, and Σ123 L13 Q Σu. which imply that Σ13 H 4> Therefore. Σ13 |= Eas desired. (0091] Conversely, assume (2) holds Take m to be the total number of (^-resolutions needed to deduce every 4' € £13 from Σ123. Assume Σ123 h £ Then there is a deduction 7 witnessing Σ]3 h |. Clearly, 7has no (^-resolutions. From 7, one can obtain Y witnessing £)231~ £ by appending to 7 a deduction of every constraint in Σn and by replacing every line where an axiom from Σ13 is used by a vacuous expand/rename of the line where the deduction of that axiom ends. Clearly, Y has exactly m (72-resolutions as desired. This shows that (3) holds. [0092] Lemma 1. Under the hypotheses of Theorem 1, the following are equivalent: (Equation Removed) 10093] Proof. Assume (A. B,Q\= Σ12J for some B. Then (A, B, Q \= DC( CQ^, Z,23) (by soundness) and therefore (A, Q N DC(CQ„ , Σi23) |0ii since B is not mentioned in DC(CQ;,Σ123) [0094] Conversely, assume (A, Q N DC(CQ; , Σ,23) L_ . Set (Equation Removed) If the chase terminates and A = A' and C=C, then {A, B,Q^ Z123 by Proposition 2, which implies (A, B) 1= I]2 and (B, C) 1= Σ23, as desired. [0095] It is clear that the chase terminates, since no new constants are introduced Now assume, to get a contradiction, that A * A' and C*C. Set &Ac to the set of facts given by A and C. Then, (Equation Removed) where R is a relation in A or C not containing c or where Co, ci are distinct constants in A or C. 100961 Consider the former case, the latter is similar. If Σ123 u A,^c •" ^{ c ) then by Proposition 1 there exists £ such that S123 H- £ and A 0, the proof shows how to obtain 7*,M. TO prove the case where / = 0, simply set Tn-ij- := 7*,o where /' fulfills the conditions above Once ym/ is obtained for some /'. set y := ym,r and the lemma is proved |0100] Consider the line s containing 5, the (k +l)lh Oi-resolution in ykJ of, for example, lines / andy containing, respectively, a and ft Consider also the line r containing \ the /-th non-a2-resolution in yu of, for example, lines n and r2 containing, respectively, Xj and \2. |0J 01 ] Now consider several cases. If i,j S(x) where 1. $ z ) is a conjunction of atoms over 0\ u a2 2. there is no atom S( w) in # z ) with {.v} c {iv}, 3. {.? } £ {y},and 4. 5 and 5" are both relation symbol in 0\ or both in a2 can be deduced from Σ123 using only Oi-unifications or only (^-unifications. Define good-CQo simi Iarly. [0124] Theorem 4. Good- CQ^ and good-CQo are closed under composition and inverse Comment 1. CQl -queries are included in good- CQj" -mappings Notice that total and surjective CQ^ -mappings are not closed under composition. [0125] Example 6. Consider the CQo-mappings mn and m?3 given by ( is there a SOCQ=-mapping W13 given by (a)( 03, Sn) satisfying »iu = mn • f»22 such that Σ|j c SOCQ= has no function symbols or constants other than those appearing in Z[i u Z239 If so, it is said that z/112 and »i23 have a conservative composition. [0128) Conservative composition is important because ultimately there is a need to eliminate all function symbols to obtain CQ=-mappings, and it is not known how to eliminate function symbols introduced in equations such as those used in the proof of Theorem 5 below. Theorems 1 and 2 from the previous section show that SOCQ and SOCQ= are not closed under conservative composition, and that determining whether the conservative composition of two SOCQ=-mappings is a SOCQ=- mapping is undecidable. As in the case of CQQ -mappings, necessary and sufficient non-computable conditions are given for two SOCQ=-mappings to have a conservative composition (see Theorem 6 below), and sufficient conditions for conservative composition are given that can be checked efficiently. An algorithm for conservative composition (SOCQ'COMPOSE) is given which works correctly on SOCQ=-mappings, satisfying these conditions Additionally, good-SOCQ= is defined as a polynomial-time subset of SOCQ=, which is closed under conservative composition, hi contrast with the previous section, it turns out that in this case, equations are essential and therefore, these results do not hold for SOCQ. [0129] Before proceeding to the results, the semantics of IC(SOCQ=) constraints are briefly described. A question is, what is ihe universe from which the functions can take values, i.e., what is their allowed range? Intuitively, the problem is with the universe of the existentially quantified intermediate database. All databases are required to be finite (/ e , all relations are finite), but to have an implicit countably infinite universe. The functions are allowed to take any values from this implicit universe, as long as their range is finite These assumptions ensure that the deductive system introduced in Section 2 is sound for 1C(S0CQ=), which is needed for Theorem 6 below. [0130] Theorem 5. SOCQ=-mappings are closed under composition Proof. If SOCQ=-mappings mu and /M23 are given by (Oj, a2, Z12) and (o2, cr3, Z23), Σn >s obtained such that the composition m)2 • m2y is given by (<7|, a2, Σ23) as follows Set £13 .= I'i2 u £'23 where I'^ and IS3 are obtained from I12 and I23 by replacing every occurrence of an atom R{ .v ) where R e a2 with_/?«( .V ), gR{ x ) = /i«( x ), where fin( x ) .= _At£ (x } x =][x), and/, gn, hR are new function symbols Then if (A,Q e mi2 • W23. then there is a B such that (AJB) \= mu and (B,Q t= m2}. Set g«( c ) = /;«( c) iff R{ c ) holds in B for every relation R in 02 by using any two values from the domain. Furthermore, set J[c) = c, if c is in the active domain of B and/(c) = c\ olhenvise, where c' is some value from the domain. Then (A,C) \= £13. Conversely, if {A,C) N I13, then set/? := {c:fin(c),gn(c) = h^c)} for every relation R in a2 to obtain B such that {AJ1) \= ni\2 and (B,C) t= m2i. fin(x) is a finite predicate, since the range of/ is finite Hence, every relation R in B is finite. [01311 Theorem 6. If the SOCQ -mappings nm, '"23 are given by (ffi, o2, £12) and (<72, Oj. Iii) with Ii23 := Σ12 u I23 and <7|3 = (7| u 03, then the following are equivalent; 1. There is a finite set of constraints £13 C IC(SOCQ=) over the signature On such that m := »i|2 • /M23 is given by (Oj, a3, £13) where S13 has no function symbols or constants other than those appearing in £123 2 There is a finite set of constraints I13 C IC(SOCQ=") over the signature o)3 such that DC(SOCQ=. £123) a = DC(SOCQ=, £u) where £,3 has no function symbols or constants other than those appearing in £123 3. There is k such that for every £ over 013 satisfying £1231" £ there is a deduction of I from £123 using at most k ffi-resolutions. Proof. Essentially the same as that of Theorem 3, using Lemma 3 below instead of Lemma 1. 10132] Lemma 3. Under the hypotheses of Theorem 6, the following are equivalent' 1. (.4,01= DC(SOCQ=,Zi2j) 2. 35(/l,S,0!=£i23. Theorem 6 suggests essentially the same algorithm for composition of SOCQ~-mappings as CQ„ COMPOSE; call it SOCQ=COMPOSE. The only difference between them is that SOCQ'COMPOSE operates on IC(SOCQ=) constraints while CQjJ COMPOSE operates on IC( CQo ) constraints. Correctness of SOCQ=COMPOSE, sufficient conditions for its termination, and good-SOCQ=-mappings are defined for SOCQ= just like for CQl. COMPOSITION OF CQ=-MAPPRMGS [0133) This section considers how to convert SOCQ=-mappings back to CQ=-mappings. Consider the case of CQ-mappings and CQ=-mappings. In order to compute the composition of two CQ=-mappings mn, ;«23 given by (, then Σ t= . The converse may not hold There are known cases in which there exists such h* which is also complete (/ e , the converse does hold). For example, this is the case when Σ has stratified witnesses. For sake of brevity, options for the implementation of h* are not described. DESKOLEMIZE works for any h* satisfying the condition above; the more complete h* is, the larger the set of inputs on which DESKOLEMIZE succeeds. Accordingly, FIG. 8 illustrates a methodology for de-Skolemization in accordance with the subject invention. [0137] Procedure DESKOLEMIZE (Σ) At 800, an unnesting process is initiated. Set A i := {§: $ e Σ} where ' is equivalent to and obtained from 0by "unnesting" terms and eliminating terms from relational atoms and from the conclusion so that in <(>': (a) Function symbols occur only in equations in the premise (b) Every term/(« ) occurs in only one atom. (c) Every equation is of the form v =/{u ) for some variable v (a term variable fory), function/ and sequence of variables u (a defining equation) or of the form v = u for two term variables u and v (a restricting equation) (d) The conclusion contains at most one atom, which is not a defining equation. Relational atoms in which a term variable occurs are called restricting atoms. If A is a restricting relational atom or a restricting equation in which v occurs, where v is a term variable for/ A is called an/restricting atom. A constraint is called restricted if it has restricting atoms in the premise. [0138] At 802, a check for cycles is performed. For every e A i, construct the graph G# where the edges are variables in ^and where there is an edge (v, u) iff there is an equation of the form v=f{. u .). If G# has a cycle, abort. Otherwise, set A 2 := A |. [0139J At 804, a check for argument compatibility is performed For every e A:, check that ^does not contain two atoms with the same function symbol. If it does, abort Otherwise, set A3 = A2. [0140] At 806, variables are aligned. Rename the variables in A3 to obtain A4 satisfying' (a) For every function symbol/and any two equations of the form v = J[ u ) and y ~J{ x) in A4, v is the same variable as y and u is the same sequence of variables as ?. (b) For every two different function symbols/and g and any two equations of the form v -J[Ti ) and _y = g( x ) in A4, v and y are different variables If this is not possible, abort. 10141] At 808, restricting atoms are eliminated. Pick some ordering of the function symbols in A4:/i,... -fk- Set Ao = A4. For n = 1,.. ,k- 1, set 4,+, := {fa: fa& \} where fa is obtained from ^as follows. Set \p to be ^ with the/,+i-restncting atoms removed from the premise. If \ h* \p, set fa:- \j/; otherwise set fa := fa. In any case, 4I+I —4i Set A 5 :=A*- {0142] At 810, constraints with restricting atoms are eliminated. Set A6 to be the set of constraints e A 5 which can not be eliminated according to the following test ^can be eliminated if (a) 0 contains an/-restricting atom in the premise, and (b) there is no constraint \j/ e A $ which has an/restricting atom in the conclusion and no/restricting atoms in the premise. [0143] At 812, a check for restricted constraints is performed. Set A to the set of unrestricted constraints in A6. If there is any 0 e \6 such that A V* fa abort. Otherwise, set A7 := A. |0I44] At 814, a check for dependencies is performed. For every

C Ag} where V** is defined as follows. If there is a function/which appears in every 0 e , then the premise of fa consists of the atoms in all the premises in 4> and the conclusion of fa consists of the atoms in all the conclusions of* (remove duplicate atoms). Otherwise, fa is some constraint in *. Notice that AQ D Ag since \f/^) = for every

with all equations removed and where the conclusion of fi is the conclusion of , with all variables appearing on the left of equations in ^existentially quantified. [0148) At 820, unnecessary 3 variables are eliminated. Set A12 := {': e A11} and return A12 where ' is like $ but where existentially quantified variables which do not appear in the conclusion atom have been removed (with their corresponding existential quantifier). [0149] Example 8. Consider three runs of the algorithm DESKOLEMIZE (£'„), for / e {1, 2, 3}. Let i;3 ={71,... , 7} be a set of the following (unnested) SOCQ=- dependencies (Equation Removed) For completeness, note that each ZJ3 is obtained by first de-Skolemizing CQ=- mappings given by ΣJ, and I'2J, which are shown below, and then invoking SOCQ'COMPOSE- (Equation Removed) [0150] In all three runs of DESKOLEMIZE (Z[}), processes 802 and 804 pass, since each of 7t, 7>, 7J is cycle-free and has no multiple atoms with the same function symbol Process 806 has no effect, since the variable names of the dependencies are already aligned. The remaining processes are explained below. [0151] In the run DESKOLEMIZE ({y\}), process 808 has no effect, because {71} is a singleton set. Its only member 71 gets eliminated in process 810, since there are no rules in (YJ } with/restricting atoms in conclusions Intuitively, 71 is a tautology because an/ can be constructed whose range is disjoint with R\. Hence, {71 \ is equivalent to the empty set of constraints, which is trivially in IC(CQ). [0152] In the run DESKOLEMIZE ({71, T>}), 75 contains an/restncting atom T2 in its conclusion. Hence, the restricted constraint 71 can not be eliminated in process 810, and so de-SkoIemization aborts in process 812. [0153] In the run DESKOLEMIZE {{71, 72, 73}), de-Skolemization is possible despite 7J. In process 808, Ao - {71, 7;, 75}. By considering the only function symbol/ get Ai = {\f/, 72, ~f\\ sAo where I/- is obtained by eliminating the restricting atom R\(y) from the premise of 71 as (Equation Removed) [0154] Clearly, Ao H* \p, since Ao ID {71,75} h* \f/. Ai has no restncting constraints, so process 810 has no effect and process 812 passes. Process 814 succeeds with A8 = A7 = {^ 7!, 7<}, since every dependency in Aj has at most one term variable y in its conclusion For example, doing the calculation for 75 yields V-$ = {,x1y),DyjJC = D-#j = Uuet'ii D-$M = {*}. In process 816, combining the dependencies for $ = As yields y4 := *i (v), R2(x), v = Ax) -* r,(v), 720'),*i(v) (Combinations resulting from proper subsets of Ag are not shown for brevity.) In process 818, the redundant constraints which include ty, -$., 73 are removed because they share the premise with 74 and their conclusion is subsumed by that of 74; the result is A10 = {71}. Finally, at 818, replacing function/"by an existential variable in 74 yields (Equation Removed) Thus, DESKOLEMIZE ({71,7., 73}) C IC(CQ). [0155] Theorem 7. ^DESKOLEMIZE (Σ) succeeds on input Σ C 1C(S0CQ=). then Σ' •= DESKOLEMIZE (Σ) satisfies (Equation Removed) The following follows from the descnption of the algonthm. (0156) Proposition 3. 1. DESKOLEMIZE (Σ) succeeds on input Σ C IC(SOCQ=) iff it reaches process 816, which can be checked in polynomial time in the size ofL 2. Furthermore, if for some constant I independent of'Σ, for every function symbol in Ag there are no more than I rides in Ag in which f occurs, then DESKOLEMIZE (Σ) runs in polynomial time in the size oft>. [01571 From the first part of Proposition 3, it follows that sufficient conditions for success of CQ'COMPOSE (Σ12, Σ23) can be checked in polynomial time, because the size of the result of SKOLEMIZE is polynomial in the size of its input, and because sufficient conditions can be checked for the termination of SOCQ'COMPOSE (Σ'|2, ^23) in polynomial time [0158] Since DESKOLEMIZE (Σ) may produce a result of size exponential in the size of Σ due to process 816, CQXOMPOSE (Σ|j, Σ23) may produce a result of size exponential in the size of Σ12 U Σ23. The following example shows that in the general case this is unavoidable. However, the second part of Proposition 3 shows that we can efficiently check for more restrictive conditions that ensure not only that DESKOLEMIZE succeeds, but also that it terminates in polynomial time. [0159) Theorem 8. For any subexponential function/ there are source-to-target CQ-mappings m\2 and m^i given by (ffi, <72. Σ12) and (02, O3, Σ23) such that mu = m\2 • »»2j is a CQ-mapping which is not given by (0\, ayX\i) for any Σu whose size is less thanks) where s is the size of Σ12 U Σ23. [0160] Proof. (Outline) Pick k large enough so tha\j{2kt) < 2* where / is the size of the longest constraint below (which 9 is about 18 + log k characters long). Such k must exist, since/is subexponential. Set [k] := {1,. .., k}. Consider the CQ=-mappings /«|2 and 7W23 given by (Oj, o2, Σ|2) and (02, 03, Σ;3) where (Equation Removed) for 1 e [*]. and where ax = {/?,: 1 e {0,... , A"} \,o2 = {Sl.i e {0 k)\, and a3 = {7",: / e [k]). The composition mn := ni\2 • mu is given by the set Σ13 of constraints such that Z C [k], where Rrfx) := AI(LZ R<{*) and similarly for Tz. On the other hand, wl3 is not given by any (o\, 03, Σ) where Σ has fewer than 2* -1 constraints. DOMArN. RANGE, AND OTHER OPERATORS |0161 ] Composition is one of the operators that take models and/or mappings as inputs and give models and/or mappings as output Composition can be performed using a combination of other operators, which in tum can be implemented using similar techniques as outlined above. These other operators include Domain dom(w) Range rng(/«) Intersection Ac\B Identity id(.4) Cross product A* B Intersection ni\ n in2 Inverse m'] [0162] These operators are defined as follows. (Equation Removed) [0163] In a relational case, with signatures, a signature is associated to every model and, input and output signatures to every mapping, and the obvious signature-compatibility conditions on these operators are required. [0164] Proposition 4. Every C 2 CQo is closed under identity, cross product and intersection. Pi oof If mi: and W34 are given by (0|, 02, S12) and (03, 04, Σ34), and ,4 and B are given by (ai, Ii) and (Qi, I2), then (Equation Removed) [01651 To express identity, refer to the third auxiliary signature a': (which is normally ignored) which contains, for every relation symbol R in 02, a relation symbol R of the same anty. In this case, 0\ = 02, so a\ = a\. • id(A) is given by (

Documents

Application Documents

#	Name	Date
1	2929-del-2005-gpa.pdf	2011-08-21
2	2929-del-2005-form-5.pdf	2011-08-21
3	2929-del-2005-form-3.pdf	2011-08-21
4	2929-del-2005-form-2.pdf	2011-08-21
5	2929-del-2005-form-18.pdf	2011-08-21
6	2929-del-2005-form-13.pdf	2011-08-21
7	2929-del-2005-form-1.pdf	2011-08-21
8	2929-del-2005-drawings.pdf	2011-08-21
9	2929-del-2005-description (complete).pdf	2011-08-21
10	2929-del-2005-correspondence-others.pdf	2011-08-21
11	2929-del-2005-claims.pdf	2011-08-21
12	2929-del-2005-assignment.pdf	2011-08-21
13	2929-del-2005-abstract.pdf	2011-08-21
14	2929-DEL-2005_EXAMREPORT.pdf	2016-06-30