
“Compression Of Numerical Data”

Abstract: The invention relates to a computer-implemented method for compressing numerical data comprising a structured set of floating point actual values. A floating point value is defined by a sign, an exponent and a mantissa. The method comprises computing a floating point predicted value related to a target actual value of the set. The computing includes performing operations on integers corresponding to the sign, to the exponent and/or to the mantissa of actual values of a subset of the set. The method also comprises storing a bit sequence representative of a difference between integers derived from the target actual value and the predicted value. Such a method is particularly efficient for reducing the storage size of a CAD file.


Patent Information

Application #
Filing Date
10 May 2011
Publication Number
49/2012
Publication Type
INA
Invention Field
ELECTRONICS
Status
Parent Application
Patent Number
Legal Status
Grant Date
2022-12-15
Renewal Date

Applicants

DASSAULT SYSTEMES
10 Rue Marcel Dassault  78140 VELIZY VILLACOUBLAY  FRANCE

Inventors

1. Arnaud DESLANDES
70 rue des Nouvelles  92150 SURESNES  FRANCE

Specification

FORM 2
THE PATENTS ACT, 1970 (39 of 1970) & THE PATENTS RULES, 2003
COMPLETE SPECIFICATION (See section 10, rule 13)
“COMPRESSION OF NUMERICAL DATA”
DASSAULT SYSTEMES, of 10 Rue Marcel Dassault, 78140 VELIZY VILLACOUBLAY, FRANCE
The following specification particularly describes the invention and the manner in which it is to be performed.

COMPRESSION AND DECOMPRESSION OF NUMERICAL DATA

FIELD OF THE INVENTION

The invention relates to the field of computer programs and systems, and more specifically to a computer-implemented method for compressing and decompressing numerical data.

BACKGROUND OF THE INVENTION

Computer-aided techniques are known to include Computer-Aided Design or CAD, which relates to software solutions for authoring product design. Similarly, CAE is an acronym for Computer-Aided Engineering, e.g. it relates to software solutions for simulating the physical behavior of a future product. CAM stands for Computer-Aided Manufacturing and typically includes software solutions for defining manufacturing processes and operations. In computer-aided techniques, the graphical user interface (GUI) plays an important role as regards the efficiency of the technique. Most of the operations required for manipulating and/or navigating the modeled objects may be performed by the user (e.g. the designers) on the GUI. Especially, the user may create, modify, and delete the modeled objects forming the product, and also explore the product so as to comprehend how modeled objects are interrelated, e.g. via a product structure. Traditionally, these operations are carried out through dedicated menus and icons which are located on the sides of the GUI. Recently, CAD systems such as CATIA allow calling these functions nearby the representation of the product. The designer no longer needs to move the mouse towards menus and icons; operations are thus available within reach of the mouse.
In addition, the operations behave semantically: for a given operation selected by the designer, the CAD system may suggest to the designer, still nearby the mouse, a set of new operations according to the former selected operation that the designer is likely to select. Also known are Product Lifecycle Management (PLM) solutions, which refer to a business strategy that helps companies to share product data, apply common processes, and leverage corporate knowledge for the development of products from conception to the end of their life, across the concept of extended enterprise. By including the actors (company departments, business partners, suppliers, Original Equipment Manufacturers (OEM), and customers), PLM may allow this network to operate as a single entity to conceptualize, design, build, and support products and processes. Some PLM solutions make it for instance possible to design and develop products by creating digital mockups (a 3D graphical model of a product). The digital product may be first defined and simulated using an appropriate application. Then, the lean digital manufacturing processes may be defined and modeled. The PLM solutions provided by Dassault Systemes (under the trademarks CATIA, ENOVIA and DELMIA) provide an Engineering Hub, which organizes product engineering knowledge, a Manufacturing Hub, which manages manufacturing engineering knowledge, and an Enterprise Hub which enables enterprise integrations and connections into both the Engineering and Manufacturing Hubs. Altogether, the system delivers an open object model linking products, processes and resources to enable dynamic, knowledge-based product creation and decision support that drives optimized product definition, manufacturing preparation, production and service. Such PLM solutions comprise a relational database of products. The database comprises a set of textual data and relations between the data.
Data typically include technical data related to the products, said data being ordered in a hierarchy of data and indexed to be searchable. The data are representative of the modeled objects, which are often modeled products and processes. Product lifecycle information, including product configuration, process knowledge and resources information, are typically intended to be edited in a collaborative way. A number of systems and programs are thus offered on the market for the design of objects (or parts) or assemblies of objects, forming a product, such as the one provided by Dassault Systèmes under the trademark CATIA. These CAD systems allow a user to construct and manipulate complex three-dimensional (3D) models, or two-dimensional (2D) models, of objects or assemblies of objects. CAD systems thus provide a representation of modeled objects using edges or lines, in certain cases with faces. Lines or edges may be represented in various manners, e.g. non-uniform rational B-splines (NURBS). These CAD systems manage parts or assemblies of parts as modeled objects, which are mostly specifications of geometry. Specifically, CAD files contain specifications, from which geometry is generated, which in turn allows for a representation to be generated. Geometry and representation may be stored in a single CAD file or multiple ones. CAD systems include graphic tools for representing the modeled objects to the designers; these tools are dedicated to the display of complex objects, the typical size of a file representing an object in a CAD system being in the range of one Megabyte per part, and an assembly may comprise thousands of parts. A CAD system manages models of objects, which are stored in electronic files. 2D or 3D models created by a user with CAD software thus contain geometric objects such as points, vectors, curves, surfaces and meshes. These objects are usually represented with floating-point values as well as other data types.
A floating point value is a value of a data type used to represent a number which belongs to the real numbers (in the mathematical sense). One of the most widely used standard formats for floating point values is the double-precision floating point defined in the IEEE 754 format standard, more particularly IEEE 754-1985. In this format, a floating point value a representing a real number ã is defined by a sign, an exponent and a mantissa on 64 bits. If a is a 64-bit floating point value in the IEEE 754 standard, we can write a = (s, e, m) with the following components: the sign s (integer coded on 1 bit), the exponent e (integer coded on 11 bits), and the mantissa m (integer coded on 52 bits). Then, by definition of the standard, if 0 < e < 2^11 − 1, a is said to be normalized and represents the real number ã = (−1)^s × 2^(e − bias) × (1 + m × 2^−52), with bias = 2^(11−1) − 1 = 1023. If e = 0 and m = 0, then a is said to be a zero and represents the real number ã = 0. If e = 0 and m is different from 0, then a is said to be denormalized and represents the real number ã = (−1)^s × 2^(1 − bias) × m × 2^−52, with bias = 2^(11−1) − 1 = 1023. If e = 2^11 − 1, then a is said to be invalid and does not represent any number. A basic functionality provided by CAD software is the ability to store on a persistent support the models created or modified by the user during a first session, and to allow these models to be reopened later for further use (e.g. in a file on the local disk, or on a server). The models can for instance be opened later in a second session of the same software, possibly with a different version of this software or on another platform. The platforms can differ in terms of hardware (different CPU) or in terms of software (different language compiler or interpreter). The model in the second session after opening should be exactly the same as the model in the first session before storing. Therefore, the storing must be lossless (i.e. involve no loss of information) and stable across different platforms (i.e.
such that the opening of the model provides the same result on different platforms which support the data types used to define the model). Stability issues appear when the storing and the reopening of the model involve transforming the data stored to define the model (e.g. by compressing and decompressing the data), particularly if the transformation of the model involves arithmetic operations. Indeed, different platforms provide different results for the same operations depending on the data type. For example, if a, b and c are floating point values, then some platforms will compute the operation a+b+c as (a+b)+c while some other platforms will compute the same operation as a+(b+c), which will not necessarily lead to the same result. Moreover, floating point arithmetic involves intermediaries to perform the computations. These intermediaries do not have the same bit length on different platforms, which leads to different results. Thus, the same floating point operations performed on different platforms can lead to different results although the operations are performed on the same data. The document “What every scientist should know about floating point arithmetic”, ACM Computing Surveys, Vol.23, No 1, March 1991, by David Goldberg, presents issues related to operations on floating point values. In the following, stability will be said to be ensured for an operation (or a series of operations) if the operation(s) leads (lead) to the same result on any regular platform. Models can be stored on the persistent support in a straightforward implementation, i.e. without compression. In this implementation, the floating point values and other data defining a given geometric object are stored as such. This straightforward method (i.e. without compression) is notably used in CATIA and in other CAD software. With such a method, the storing is lossless. Indeed, the data defining the model is not modified before storing, and there can therefore not be any loss of data. 
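The sign/exponent/mantissa decomposition described above can be illustrated in a few lines. The following Python sketch (the helper names are ours, not part of the specification) reinterprets the 64-bit pattern of a double as integers, which involves no floating point arithmetic and loses no information:

```python
import struct

def decompose(a):
    """Split a 64-bit IEEE 754 double into its (sign, exponent, mantissa) integers."""
    bits = struct.unpack('<Q', struct.pack('<d', a))[0]
    s = bits >> 63                # 1 bit
    e = (bits >> 52) & 0x7FF      # 11 bits
    m = bits & ((1 << 52) - 1)    # 52 bits
    return s, e, m

def recompose(s, e, m):
    """Rebuild the double from its three integer components."""
    bits = (s << 63) | (e << 52) | m
    return struct.unpack('<d', struct.pack('<Q', bits))[0]

s, e, m = decompose(-6.25)        # -6.25 = -1.5625 * 2^2
assert (s, e) == (1, 1025)        # biased exponent: 2 + 1023 = 1025
assert recompose(s, e, m) == -6.25
```

Because the round trip is a pure bit reinterpretation, it is exact and identical on every platform that supports the IEEE 754 double type.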
The storing is also stable. Indeed, the data defining the model is not to be transformed when the model is reopened, because the data is not compressed. However, such a method fails to optimize the storage size of a CAD model. In the field of data compression in general, delta-encoding is a way of compressing data that represents a target object by storing the difference between the target object and a known reference object, instead of the target object itself. This is advantageous if the difference can be stored in less space than the target object. Predictive encoding is a variant of delta-encoding where the reference is not an actual object taken among the data but is computed from one or several actual objects using a predictor function. Thus, the predictor function predicts a reference object from actual objects. Instead of storing the target object as such (i.e. without compression), the difference between the predicted reference object and the target object is stored. The closer the prediction is to the target object, the smaller the difference is, and therefore the less storage space it takes to store the difference. The efficiency of the compression thus depends on the accuracy of the prediction. Quantization is another compression technique. Quantization is used for the compression of data comprising floating point values, possibly in combination with delta-encoding or predictive encoding. Quantization is the process of mapping floating point values to integers. Quantization produces a loss of data as it involves truncating the least significant bits of some floating point values. The article “Higher Bandwidth X” (1994) and the Ph.D. Thesis “Compressing the X Graphics Protocol” (1995) by John Danskin describe a way of compressing geometry using “relative coordinates”, which is a form of delta-encoding. However the geometry is defined with integer coordinates.
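The predictive-encoding principle just described can be reduced to a toy sketch on plain integers. The linear predictor 2*v[i-1] - v[i-2] below is a hypothetical choice for illustration, not the predictor of the invention:

```python
def compress(values):
    # store the first two values as-is, then only the residuals against
    # the linear prediction p_i = 2*v[i-1] - v[i-2]
    out = list(values[:2])
    for i in range(2, len(values)):
        out.append(values[i] - (2 * values[i - 1] - values[i - 2]))
    return out

def decompress(data):
    # rebuild each value by recomputing the prediction and adding the residual
    vals = list(data[:2])
    for r in data[2:]:
        vals.append(2 * vals[-1] - vals[-2] + r)
    return vals

seq = [10, 12, 14, 17, 21]
packed = compress(seq)            # [10, 12, 0, 1, 1]: small residuals
assert decompress(packed) == seq  # lossless round trip
```

The residuals (0, 1, 1) are much smaller than the original values, which is where the storage saving comes from when the data is nearly linear.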
The method is therefore inappropriate for CAD models whose geometry is defined with a higher level of precision. The article “Geometry Compression” (1995) by Michael Deering describes the compression of triangular meshes using quantization of floating-point numbers and delta-encoding between neighbors. The article “Triangle Mesh Compression” (1998) by Costa Touma and Craig Gotsman also describes the compression of triangular meshes, but using quantization and predictive encoding. The articles “Geometric Compression Through Topological Surgery” (1998) by Gabriel Taubin and Jarek Rossignac and “Compressing Polygon Mesh Geometry with Parallelogram Prediction” (2002) by Martin Isenburg and Pierre Alliez describe a similar approach with a different prediction scheme. The prediction is computed by a linear combination of other points in the mesh. All these methods notably present the shortcoming of producing a lossy compression because of quantization. The article “Out-of-core Compression and Decompression of Large n-dimensional Scalar Fields” (2003) by L. Ibarria et al. describes a prediction encoding method for floating point data. The prediction function involves floating point arithmetic computation and thus has the shortfall that stability is not guaranteed across different platforms. Indeed, as mentioned above, floating point arithmetic computations do not produce the same result on different platforms. The article “Lossless Compression of Floating-Point Geometry” (2004) by Isenburg et al. describes a prediction encoding method for floating point data. As above, stability is not guaranteed across different platforms. US patents US5793371, US5825369, US5842004, US5867167, US5870094, US5905502, US5905507, US5933153, US6047088, US6167159, US6215500, US6239805, US6522327, US6525722, and US6532012 describe similar methods and none of these documents addresses the issue of stability.
The article “Lossless Compression of High-volume Numerical Data from Simulations” (2000) by Engelson et al. describes compression of floating point values. The values do not represent geometric objects, but rather form a sequence of values that change smoothly and are parameterized by a given variable. The article provides an example of a sequence of three values (a1, a2, a3) which is a linearly growing sequence (i.e. a3 − a2 ≈ a2 − a1). A simple prediction encoding scheme for this sequence would be to take ap3 = a2 + a2 − a1 as the prediction for a3. The difference between the predicted value and the actual value is ∆2a3 = a3 − ap3 = (a3 − a2) − (a2 − a1). As the sequence is linear, the prediction is good and ∆2a3 is small: (a1, a2, ∆2a3) can be stored, which takes less storage size than the original sequence. A problem, noted by the article, is that the computation of the prediction involves floating point arithmetic operations, which prevents stability on different platforms. The article thus introduces the notion of the integer representation of a floating-point number. If p is a 64-bit floating-point number, its integer representation Int(p) is defined as the integer that is represented by the same 64-bit string as p. The integer representations are defined as b1 = Int(a1), b2 = Int(a2), b3 = Int(a3). The compression then applies the prediction-encoding described above on the sequence (b1, b2, b3) and stores (b1, b2, ∆2b3) with ∆2b3 = (b3 − b2) − (b2 − b1). With the aim of guaranteeing stability, the document thus suggests performing the following steps: converting m consecutive floating point values into their integer representations and computing a sequence of classical integer subtractions on these integer representations. However, with some numerical values, the method of Engelson et al. is totally inefficient.
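The Int(p) scheme of Engelson et al. can be sketched as follows (a Python illustration; little-endian packing is our assumption for the reinterpretation). When all values of a linear sequence share the same exponent, the integer differences mirror the floating point differences and the residual is exactly zero:

```python
import struct

def int_repr(p):
    # Int(p): the integer represented by the same 64-bit string as the double p
    return struct.unpack('<Q', struct.pack('<d', p))[0]

# a linear sequence whose three values all lie in [1, 2), i.e. same exponent
a1, a2, a3 = 1.25, 1.5, 1.75
b1, b2, b3 = int_repr(a1), int_repr(a2), int_repr(a3)
delta = (b3 - b2) - (b2 - b1)   # the stored residual, Delta2 b3
print(delta)                    # 0: the prediction is exact here
```

Here only the mantissa field varies, so the subtraction of bit patterns behaves exactly like the subtraction of the values themselves.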
For example, using the notations of the document, the sequence of floating point values (a1 = 1.5, a2 = 2.0, a3 = 2.5) is considered (the floating point values are here referred to by the real number that they represent). This sequence is linear and the prediction-encoding scheme described above should theoretically apply very efficiently to these floating point numbers, using floating point arithmetic, because the difference ∆2a3 is exactly 0. If one however applies the method of the article, then ∆2b3 = (b3 − b2) − (b2 − b1) has 51 significant bits. This is due to the fact that the integer difference between the integer representations of two floating point values does not only depend on the floating point difference between the two floating point values but also on the values of the two floating points themselves. In the above example, b3 − b2 is different from b2 − b1 because the floating point representations of 2.5 and 2.0 have the same exponent but the floating point representation of 1.5 does not have the same exponent. The efficiency of the compression is therefore not satisfactory. Thus, a first shortfall of this method is that it does not work well enough (i.e. the compression rate is not high enough) on some types of sequences. The article further states: “The fixed step difference algorithm works well if the sequence Int(a) can be approximated by polynomials. In real applications, however, it would be better to assume that a can be approximated by polynomials.” Thus, in real applications, the method of performing integer differences on the sequence Int(a) would lead to bad predictions. Another shortfall is that the computation of ∆ can only use subtraction, not other operations. For example, multiplication and division cannot be applied to the integer representations.
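This failure case is easy to reproduce. The Python sketch below reinterprets each double's 64-bit pattern as an integer (little-endian packing is our assumption) and shows that a perfectly linear sequence still leaves a 51-bit residual, because 1.5 has exponent field 1023 while 2.0 and 2.5 both have exponent field 1024:

```python
import struct

def int_repr(p):
    # the integer represented by the same 64-bit string as the double p
    return struct.unpack('<Q', struct.pack('<d', p))[0]

b1, b2, b3 = (int_repr(a) for a in (1.5, 2.0, 2.5))
delta = (b3 - b2) - (b2 - b1)
assert delta == -2**50                  # huge residual despite a linear sequence
assert abs(delta).bit_length() == 51    # the 51 significant bits mentioned above
```

The jump comes entirely from the exponent-field boundary crossed between 1.5 and 2.0, which is invisible to a scheme that treats the whole 64-bit pattern as one integer.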
In other words, the difference between the integer representations of two floating point numbers may be representative of the difference between the two floating point numbers in some cases (with at least the exception described earlier), but the multiplication (or division, or addition) of the integer representations is not representative of the multiplication (or division, or addition) of the two floating point numbers. For the addition, for example, this is notably because the exponents of the two floating point values transformed into their integer representations would be added. This severely limits the prediction schemes that can be used. Thus, the prediction accomplished is not as accurate as possible and the compression rate is impacted. The article “Fast Lossless Compression of Scientific Floating-Point Data” (2006) by Ratanaworabhan et al. and the article “Fast and Efficient Compression of Floating-Point Data” (2006) by Peter Lindstrom and Martin Isenburg describe similar techniques with the same shortfalls. It is an aim of the invention to provide a method which is suitable to efficiently reduce the storage size of a CAD file. Such a solution would reduce the cost of the storage infrastructure and increase the speed of sending or receiving CAD models over a network.

SUMMARY OF THE INVENTION

This aim is achieved with a computer-implemented method for compressing numerical data comprising a structured set of floating point actual values, a floating point value being defined by a sign, an exponent and a mantissa, the method comprising: computing a floating point predicted value related to a target actual value of the set, the computing including performing operations on integers corresponding to the sign, to the exponent and/or to the mantissa of actual values of a subset of the set; and storing a bit sequence representative of a difference between integers derived from the target actual value and the predicted value.
Preferred embodiments may comprise one or more of the following features:
- the steps of computing and storing are iterated according to an order of the set;
- at least one beginning actual value of the set is stored and the step of computing is iterated over all the other actual values of the set, and, at each iteration, is followed by steps of comparing the computed predicted value of the iteration to a threshold, and storing: a bit sequence representative of a difference between integers derived from the target actual value of the iteration and the predicted value of the iteration, when the predicted value is higher than the threshold, or the target actual value of the iteration, when the predicted value of the iteration is lower than the threshold;
- the actual values of the set are coordinates associated to a geometric object, preferably coordinates of control points of the geometric object;
- the actual values of the subset are coordinates of control points of the geometric object neighboring another control point of the geometric object, the target actual value being a coordinate of said another control point;
- the predicted value is determined according to at least one parameter associated to said another control point, the actual values of the subset, and at least one parameter associated to each control point neighboring said another control point;
- the at least one parameters are determined according to a respective knot vector of the geometric object;
- the geometric object is a NURBS surface and the at least one parameters are the Gréville parameters;
- the integers derived from the target actual value and the predicted value are the integers defined by the strings which respectively define the target actual value and the predicted value;
- the bit sequence representative of the difference comprises a prefix bit sequence indicative of a number of significant bits, and a body bit sequence equal to the difference from which the leading zeros, and preferably the first bit equal to one, are cut and of which the size is the number of significant bits;
- the operations comprise arithmetic operations including integer addition, subtraction, multiplication and/or division, and/or bitwise operations including bit shifts and/or leading zero count.

This aim is also achieved with a computer-implemented method for decompressing numerical data, which may be compressed according to the above method, the method generally comprising: computing a floating point predicted value, the computing including performing operations on integers corresponding to the sign, the exponent and/or the mantissa of the actual values of a set, and deriving a floating point actual value from the addition of a bit sequence with an integer derived from the predicted value. The decompressing method may thus be for decompressing numerical data, wherein the numerical data are a compressed form of data comprising a structured set of floating point actual values, a floating point value being defined by a sign, an exponent and a mantissa, and wherein the numerical data include a bit sequence representative of a difference between integers derived from a target actual value of the set and a floating point predicted value related to the target value. The method then comprises computing the predicted value, the computing including performing operations on integers corresponding to the sign, the exponent and/or the mantissa of actual values of a subset of the set, and deriving the target actual value from the addition of the bit sequence with an integer derived from the predicted value.
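The prefix/body encoding of a difference can be sketched as follows. This Python illustration is our simplification: it assumes a non-negative integer difference and a fixed 6-bit prefix (enough for differences up to 63 significant bits); the prefix stores the number of significant bits and the body drops the leading zeros and the first 1-bit, which is implicit:

```python
def encode(d):
    """Encode a non-negative integer difference as prefix + body (a bit string)."""
    n = d.bit_length()
    prefix = format(n, '06b')                   # 6-bit count of significant bits
    body = format(d, 'b')[1:] if n > 1 else ''  # cut leading zeros and the first 1-bit
    return prefix + body

def decode(bits):
    """Recover the difference from its prefix + body bit string."""
    n = int(bits[:6], 2)
    if n <= 1:
        return n                                # 0 or 1: the body is empty
    return (1 << (n - 1)) | int(bits[6:6 + n - 1], 2)

for d in (0, 1, 5, 1000, 2**50):
    assert decode(encode(d)) == d               # lossless round trip
```

For example, the difference 5 (binary 101) is stored on 8 bits in total: the prefix 000011 (three significant bits) followed by the body 01, instead of a full 64-bit word.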
In embodiments, the decompressing method may comprise any one or a combination of the following features:
- the steps of computing and deriving are iterated according to an order of the set;
- at each iteration, the subset comprises target values derived at former iterations and/or target values included in the numerical data as such;
- the numerical data further include at least one beginning actual value (stored as such) of the set and the step of computing is iterated over all the other actual values of the set, and, at each iteration, the step of computing is followed by a step of comparing the computed predicted value of the iteration to a threshold, and a step of: deriving the target actual value from the addition of a bit sequence (included in the numerical data) with an integer derived from the predicted value (as above), when the predicted value is higher than the threshold, or retrieving a target actual value (included as such in the numerical data), when the predicted value of the iteration is lower than the threshold;
- the actual values of the set are coordinates associated to a geometric object, preferably coordinates of control points of the geometric object;
- the actual values of the subset are coordinates of control points of the geometric object neighboring another control point of the geometric object, the target actual value being a coordinate of said another control point;
- the predicted value is determined according to at least one parameter associated to said another control point, the actual values of the subset, and at least one parameter associated to each control point neighboring said another control point;
- the at least one parameters are determined according to a respective knot vector of the geometric object;
- the geometric object is a NURBS surface and the at least one parameters are the Gréville parameters;
- the integers derived from the target actual value and the predicted value are the integers defined by the strings which respectively define the target actual value and the predicted value;
- the bit sequence representative of the difference comprises a prefix bit sequence indicative of a number of significant bits, and a body bit sequence equal to the difference from which the leading zeros, and preferably the first bit equal to one, are cut and of which the size is the number of significant bits;
- the operations comprise arithmetic operations including integer addition, subtraction, multiplication and/or division, and/or bitwise operations including bit shifts and/or leading zero count.

This aim is also achieved with a computer-aided design system comprising: a database storing an object modeled by numerical data comprising a structured set of floating point actual values, a graphical user interface, and means for compressing numerical data according to the above compression method and/or decompressing numerical data according to the above decompression method. This aim is also achieved with a computer program comprising instructions for execution by a computer, the instructions comprising means for performing any of the above methods. This aim is also achieved with a computer readable storage medium having recorded thereon the above computer program. Further features and advantages of the invention will appear from the following description of embodiments of the invention, given as non-limiting examples, with reference to the accompanying drawings listed hereunder.

BRIEF DESCRIPTION OF THE DRAWINGS

Figures 1-4 show flowcharts of examples of a compression method, Figures 5-12 show examples of the compression of NURBS, Figure 13 shows an example of a decompression method, and Figure 14 shows an example of a client computer system suitable for performing the method of the invention.

DETAILED DESCRIPTION OF THE INVENTION

The invention is related to a computer-implemented method for processing data.
More specifically, the invention is related to a computer-implemented method for compressing numerical data which comprise a structured set of floating point actual values. The method includes computing a floating point predicted value related to a target actual value of the set. The computing includes performing operations on integers. The integers correspond to the sign, to the exponent and/or to the mantissa of actual values of a subset of the set. The method also comprises storing a bit sequence representative of a difference between integers, said integers being derived from the target actual value and the predicted value. Such a computer-implemented method allows an efficient compression of numerical data which comprise a structured set of floating point actual values, the compression being lossless and stable across different platforms supporting the floating point data type. Such a method is suitable to efficiently reduce the storage size of a file which contains numerical data comprising a structured set of floating point actual values. Because it is lossless and stable across different platforms, the method can be applied in particular to a CAD file containing specifications of a 2D or 3D CAD model. Indeed, CAD models are specified by numerical data which comprise a structured set of floating point actual values. The method can thus efficiently compress such a CAD file, thereby reducing the storage size of the CAD file. By “efficiently” compressing a file, it is meant that the compression rate is statistically higher than 20%. A compression method is said to be more efficient than another if it statistically accomplishes a higher compression rate. The method can thus reduce the cost of the storage infrastructure and increase the speed of sending or receiving CAD models over a network. The method is however directed to the compression of numerical data in general.
In other words, the method reduces the storage size of numerical data, and thus makes it possible to reduce the size of any file containing such data. The data is numerical, which means that it relates to numbers. More specifically, the method may be applied to compress any numerical data which comprise a structured set of floating point actual values. The floating point values may be of the data type defined by the IEEE 754 format standard. The method is however also applicable to other similar data types. For example, floating point values may be coded on fewer or more bits than the 64 bits of the standard. The floating point values may be of data types of a different structure, for example an exponent, a mantissa and no sign. More generally, the method applies to any numerical data type which is represented by at least two integers. However, for the sake of clarity, in the following the floating point values are considered to be of the IEEE 754 type. The set of actual floating point values is structured. By “structured”, it is meant that the set of numbers represented by the actual floating point values is coherent, i.e. the numbers are not completely independent from each other. This coherence is reflected on the actual floating point values of the set, which are not completely independent from each other either. Thanks to the set being structured, the value of each of the actual floating point values of the set may be predicted from other actual floating point values of the set. The structure may depend on the kind of object represented by the numerical data. In this case, the prediction may be performed according to a prediction scheme (possibly predetermined or selected once and for all in a first step of the method). The prediction scheme may depend on the kind of object represented by the numerical data.
Owing to the fact that, by definition, objects of a same kind are structured in a same way, the method is in this case suitable for compressing numerical data representing any object of a given kind with the same prediction scheme. An example of application of the method is on an object which is the price of a stock over time. The set of actual floating point values may represent the values of the price over a time series. It is well known that the price of the stock at a time is not completely independent from the price at other times. Financial quantitative analysts use different prediction schemes, e.g. based on Brownian motion, to make predictions. The same prediction scheme may be used for different stocks. Another example involves sets of actual floating point values representing the temperature of cities. Again in this case, a geographical proximity between two cities may involve a dependency between their temperatures. Thus, the floating point value representing the temperature of a city may be predicted from the floating point values representing the temperature of neighboring cities. Another example involves a force field applied to an object. The force field is represented by a set of actual floating point values, which is structured when the forces applied to the object present a spatial coherence. For example, a magnetic field generally induces a force field which presents such coherence. Another example, which will be discussed in more detail later, is a geometric surface of a CAD model. Such a surface generally varies smoothly and the position of a point of the surface may thus be predicted according to the position of neighboring points of the surface. The method makes use of the structure of the set of floating point actual values (which structure derives from the structure of the numbers represented by the floating point values) in order to compress the numerical data.
The principle is, as in predictive encoding, to first select one of the actual floating point values of the set. This value is called the “target” value because it is the value to be transformed by the method for compression. The method then computes a floating point predicted value related to the target actual value. In other words, the method computes a predicted value which is a prediction of the target, and not necessarily the actual value of the target. The computation is performed according to other actual values of the set and disregards the actual value of the target. The predicted value is itself of the floating point data type. Because the set is structured and the prediction scheme follows this structure, the computation of a predicted value performed according to other values of the set provides a result relatively close to the actual value of the target. Thus, the difference between the predicted value and the target actual value can be coded on few bits. This idea is the one usually followed by predictive encoding. The prediction scheme may include the selection of a subset of the set, the floating point values of the subset being the basis of the computation of the predicted value. The subset may be selected according to the target value. The values of the subset represent numbers which are theoretically relevant for the prediction of the number which is represented by the target floating point value. The way to perform this selection may be provided by the kind of the object represented by the numerical data. The computing of the predicted value includes performing operations on integers. Because the operations are not performed on floating point values, there is no stability issue. Indeed, regular platforms all perform integer operations with the same result (as opposed to operations on floating point values). The method is thus stable across all regular platforms, as it solely performs operations on integers.
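The predictive-encoding loop described above can be sketched as follows. The names (`compress`, `predict`) and the choice of residual — a plain integer difference between the two bit patterns — are illustrative assumptions, one possible encoding among those the text allows:

```python
import struct

def to_bits(x):
    # Bit pattern of an IEEE 754 double, as an unsigned 64-bit integer.
    return struct.unpack('<Q', struct.pack('<d', x))[0]

def from_bits(b):
    return struct.unpack('<d', struct.pack('<Q', b & ((1 << 64) - 1)))[0]

def compress(values, predict):
    # The first value is stored uncompressed; each following target is
    # replaced by the integer residual between its bit pattern and the
    # prediction's bit pattern (small when the prediction is good).
    out = [values[0]]
    for k in range(1, len(values)):
        pred = predict(values[:k])   # prediction uses earlier actual values only
        out.append(to_bits(values[k]) - to_bits(pred))
    return out

def decompress(stream, predict):
    # Recompute the same predictions and add the residuals back.
    values = [stream[0]]
    for residual in stream[1:]:
        pred = predict(values)
        values.append(from_bits(to_bits(pred) + residual))
    return values
```

With e.g. `predict = lambda prev: prev[-1]` (a trivial “same as last value” scheme), `decompress(compress(data, predict), predict)` returns `data` exactly: only integer arithmetic is involved, so the round trip is lossless and identical on all regular platforms.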
The integers correspond to the sign, to the exponent and/or to the mantissa of actual values of a subset of the set. The subset comprises the actual values which are used to compute the predicted value of the target actual value. The method reads, at least for the actual values used for the prediction, the bit slots reserved for the sign, for the exponent and/or for the mantissa and interprets what is read as an integer. Any way of interpretation is within the scope of the method. For example, the method may interpret the mantissa coded on 52 bits as a 64-bit integer. The integer corresponding to the mantissa may for example be a 64-bit integer whose first 12 bits are equal to 0 and whose last 52 bits are the bits of the mantissa. Inversely, the integer corresponding to the mantissa may for example be a 64-bit integer whose last 12 bits are equal to 0 and whose first 52 bits are the bits of the mantissa. Longer integers may also be used. In general, the interpretation of the sign, the exponent and/or the mantissa depends on the next steps of the prediction scheme (i.e. the sequence of operations performed to compute the predicted value) and on the precision required. As the integers correspond to the sign, the exponent and/or the mantissa of actual values of the subset, the prediction scheme takes advantage of the floating point form of the actual values. Indeed, the operations are performed on representatives of the sign, of the exponent and/or of the mantissa considered individually, instead of roughly determining one integer representative per floating point value without any consideration of the characteristics of the floating point, as in the method of Engelson et al. discussed above. Notably, operations other than integer subtraction may also be performed. For example, integer addition, multiplication and division performed on the sign, the exponent and/or the mantissa make sense. The prediction is thus refined and is more accurate.
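The two interpretations of the 52 mantissa bits mentioned above — right-aligned (first 12 bits equal to 0) or left-aligned (last 12 bits equal to 0) — can be sketched as follows; the function names are illustrative:

```python
import struct

def mantissa_right_aligned(x):
    # 64-bit integer whose first 12 bits are 0 and whose last 52 bits
    # are the mantissa bits of x.
    bits = struct.unpack('<Q', struct.pack('<d', x))[0]
    return bits & ((1 << 52) - 1)

def mantissa_left_aligned(x):
    # 64-bit integer whose last 12 bits are 0 and whose first 52 bits
    # are the mantissa bits of x.
    return (mantissa_right_aligned(x) << 12) & ((1 << 64) - 1)
```

The right-aligned form is convenient for additions (no overflow headroom lost), while the left-aligned form keeps the most significant mantissa bits in the most significant integer bits, which suits shift-based normalization.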
A more accurate prediction leads to a higher compression rate. Examples showing how the prediction is refined are provided later. As mentioned above, the method then comprises storing a bit sequence representative of a difference between integers, said integers being derived from the target actual value and the predicted value. Because the set is structured, the prediction generally leads to a predicted value close to the target actual value. In this case, the difference between integers derived from the target actual value and the predicted value is small. The difference can thus be stored on a bit sequence of smaller size than the size of the target actual value. Thus, instead of storing the target actual value as such, a bit sequence of smaller size is stored. The bit sequence can then be used in a decompression step to retrieve the target actual value. Because the bit sequence is representative of a difference which is performed between integers, the stability and lossless qualities of the compression remain. Indeed, the difference between two integers leads to the same result on all regular platforms, without any loss of information. The derivation of the integers from the target actual value and the predicted value is detailed later. The above explanations concern the general case of application of the method. Referring to figure 1, the general case includes a single application of the steps of computing S10 a predicted value predkt and of storing S20 the bit sequence Bk. As shown in the example of figure 1, the method may also include a step of providing S0 a structured set E of floating point actual values valk and a step S5 of selecting a target actual value valkt of the set to which the predicted value predkt to be computed relates, before the step of computing S10. The steps of computing S10 and storing S20 are preferably iterated (i.e. repeated), as exemplified in figure 2. At each iteration (i.e.
repetition) k, the target value valkt is a new selected actual value of the set E, which has not been the target value of one of the previous iterations of the method. Such iteration is performed according to an order (0, …, k-1, k, k+1, …) of the set E. The set is thus browsed (i.e. parsed) according to the order, and at each iteration k of the browsing, the selection S5 of the target value follows the order and the value valk numbered k is selected. The order may be predetermined. For example, the numerical data may comprise information regarding the order. The order may alternatively be arbitrarily determined at the start, before performing the iterations. It may also be determined dynamically during the iterations. Such a method statistically allows the global compression of the numerical data. As exemplified in figure 3, the method may store S3 at least one beginning actual value val0 of the set, for example one value (val0), two values (val0 and val1), three values (val0, val1 and val2) or more, and the step of computing S10 may be iterated over all the other actual values of the set. For example, if one beginning value val0 is stored in S3 and the set E has n values, then the step of computing S10 may be iterated on the remaining values val1, val2, …, valn. Alternatively, some of the remaining values may be skipped (this is however equivalent to redefining the set to the values on which the step of computing S10 is performed). The step of storing S20 may itself be performed on all or some values on which the step of computing S10 has been performed. A beginning value is thus a value which is stored as such (i.e. uncompressed). Thus, a beginning value is not a target of an iteration k subjected to the computing S10 and storing S20 steps. When decompressing, as will be seen later, the prediction may be performed based on actual values of the subset. Thus, at least some values may be stored as such to allow the start of the decompression.
Alternatively, the first predictions for actual values may be performed based on a stored mean of actual values. Alternatively, the first predictions may be performed according to other parameters comprised in the numerical data. At each iteration k, the step of computing S10 may be followed by a step of comparing S15 the computed predicted value predkt of the iteration k to a threshold, as exemplified in figure 4. In this case, at the following step of storing, what is stored depends on the result of the comparison. The method either stores, in a step S20 and as described earlier, a bit sequence Bk representative of a difference Dk between integers derived from the target actual value valkt of the iteration and the predicted value predkt of the iteration k, when the predicted value predkt is higher than the threshold. Or the method stores, in a step S20’, the target actual value valkt of the iteration, uncompressed, when the predicted value of the iteration is lower than the threshold. At each computing step S10, the predicted value predkt is computed in a way to be as close as possible to the target actual value valkt. However, in many contexts it can be estimated a priori that the difference between predkt and valkt has a high probability of being too big to be compressed effectively. If the predicted value predkt is smaller than a given threshold, then there is a risk that the difference between predkt and valkt is higher than predkt, thus leading to a bit sequence Bk, representative of the difference between integers derived from predkt and valkt, that is longer than the number of bits used to code valkt as such. For example, if the predicted value predkt is close to 0, there is a risk that the actual value valkt is of a different sign from the predicted value predkt, in which case the bit sequence Bk is too long. This will appear even more clearly when examples of such bit sequences Bk are provided.
To avoid a probable and unwanted increase in encoding space, if the prediction is smaller than the threshold, the number valkt itself is stored and no attempt at compression is made. Furthermore, the overall process speed is increased, as the bit sequence Bk is not determined. The threshold may be a threshold on the value of the exponent of the predicted value predkt. In other words, the exponent of predkt is compared to the threshold. Such a threshold ensures that the comparison step S15 is fast. Such a threshold may be predetermined and may depend on the context of application. The threshold may alternatively be determined according to the numerical data and stored in order to be reused during the decompression. The value of the threshold may take into account the rounding precision of the arithmetic operations used to compute S10 the predicted value predkt. The threshold may also be determined dynamically according to previously computed predicted values (predk-1t, predk-2t, …) and/or the related actual values (valk-1t, valk-2t, …). The threshold of an iteration k typically does not depend on valkt or predkt. The comparison with a threshold renders the compression more efficient by handling the case of actual values whose predicted values have a high probability of leading to a bad compression rate. The step of storing S3 at least one beginning actual value of the set is not represented on figure 4, but may be comprised in the method after the step of providing S0 the structured set E of floating point actual values valk, as in figure 3. The actual values of the set may be coordinates associated to a geometric object, for example coordinates of control points of the geometric object. Numerical data comprising floating point coordinates of geometric objects make up a high proportion of CAD files. The method is thus particularly suitable for the compression of CAD files. In an example, the geometric object is a NURBS surface.
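A possible sketch of the comparison step S15, using a threshold on the biased exponent of the predicted value (the function names and the residual encoding are illustrative assumptions):

```python
import struct

def to_bits(x):
    return struct.unpack('<Q', struct.pack('<d', x))[0]

def encode_one(pred, target, exponent_threshold):
    # When the prediction is too close to zero (small exponent), the residual
    # would likely be longer than the raw value, so store the target
    # uncompressed (step S20'); otherwise store the residual (step S20).
    biased_exponent = (to_bits(pred) >> 52) & 0x7FF
    if biased_exponent < exponent_threshold:
        return ('raw', target)
    return ('residual', to_bits(target) - to_bits(pred))
```

Extracting the 11 exponent bits is a mask and a shift, which is why a threshold on the exponent keeps the comparison step fast.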
In other examples, the geometric object is a curve, e.g. a NURBS curve, or another type of curve or surface. The method applies to any geometric object in general. Examples of the method are detailed in the following with reference to NURBS, for the sake of clarity. But it must be understood that the following explanations apply to other uses of the method. A NURBS surface is a surface, widely used in CAD, defined on a 2D space with values in a 3D space: S(u, v) → (x, y, z). A NURBS surface is represented with two knot vectors and a 2-dimensional array of control points, and possibly other data depending on the CAD software used. Typically, the knot vectors are arrays of floating point values. The array of control points may be ordered. Each control point is a 3D point with coordinates x, y, z. The place in the array is defined with the two indices i and j, with 0 ≤ i ≤ n and 0 ≤ j ≤ m. In the following, the notations below are used for 64-bit integers:
• M >> n: right shift of n bits of the integer M
• M << n: left shift of n bits of the integer M
• LZC(M): leading zero count of M (i.e. number of consecutive zeros on the left of M)
• M+ = (M >> 32): the 32 most significant bits of M
• M- = (M << 32) >> 32: the 32 most significant bits of M are set to zero
In the following, rules for the addition are provided. As the addition is commutative, we can assume that we have EA ≥ EB. If A and B have the same sign, we have C = A + B with:
MExact = MA + MB * 2^(EB - EA)
The objective is to define a 64-bit integer M that is an approximation of MExact and that can be computed using only integer arithmetic and bitwise shifts, while ensuring that the real number corresponding to M is not too far from the real number corresponding to MExact. Computing for example:
M = (MA >> 1) + (MB >> (EA - EB + 1))
is convenient. It is easy to check that the computation of such M causes no overflow and that M is a good enough approximation of MExact. The method may thus compute C as:
C = (SC, EC, MC)
SC = SA
EC = EA + 1 - LZC(M)
MC = M << LZC(M)
If A and B have a different sign, we have C = A + B with:
MExact = MA - MB * 2^(EB - EA)
A similar approach leads to M = MA - (MB >> (EA - EB)) and to:
C = (SC, EC, MC)
SC = SA
EC = EA - LZC(M)
MC = M << LZC(M)
For computing a subtraction between two operands, the method performs the addition on the operands A and (-B) as with the addition defined above. For the multiplication, we have C = A * B with:
MExact = MA * MB * 2^(-64)
It is easy to check that 0 < MExact ≤ 2^64 - 1. We want to define a 64-bit integer M that is an approximation of MExact and that can be computed using only integer arithmetic and bitwise shifts. We write MExact as:
MA = MA+ * 2^32 + MA-
MB = MB+ * 2^32 + MB-
MExact = MA+ * MB+ + (MA+ * MB- + MB+ * MA-) * 2^(-32) + MA- * MB- * 2^(-64)
We compute M as:
M = MA+ * MB+ + ((MA+ * MB-) >> 32) + ((MB+ * MA-) >> 32)
The operations in the formula above can be performed using 64-bit integer arithmetic without causing overflow, because all the products have operands that have their 32 most significant bits set to null. It is also easy to see that the sum of these products fits into a 64-bit integer without overflow, because we have M ≤ MExact and so M ≤ 2^64 - 1. It is easy to check that M is a good enough approximation of MExact: each right shift discards a value smaller than 1 and the dropped term MA- * MB- * 2^(-64) is smaller than 1, so MExact - 3 < M ≤ MExact. We finally define C as:
C = (SC, EC, MC)
SC = SA + SB mod 2
EC = EA + EB + 1 - LZC(M)
MC = M << LZC(M)
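The rules above can be sketched in code. This is a sketch under an assumption the text does not state verbatim: a triple (S, E, M) represents (-1)^S * M * 2^(E - 63) with the mantissa normalized so that 2^63 ≤ M < 2^64; the function names are illustrative.

```python
MASK64 = (1 << 64) - 1

def lzc(m):
    # Leading zero count of a 64-bit integer.
    return 64 - m.bit_length()

def fp_add_same_sign(a, b):
    # Same sign, EA >= EB assumed. M approximates MExact / 2, hence the "+1"
    # in the exponent: EC = EA + 1 - LZC(M).
    sa, ea, ma = a
    sb, eb, mb = b
    m = (ma >> 1) + (mb >> (ea - eb + 1))
    shift = lzc(m)
    return (sa, ea + 1 - shift, (m << shift) & MASK64)

def fp_sub(a, b):
    # Different signs, EA >= EB and |a| > |b| assumed:
    # M = MA - (MB >> (EA - EB)), EC = EA - LZC(M).
    sa, ea, ma = a
    sb, eb, mb = b
    m = ma - (mb >> (ea - eb))
    shift = lzc(m)
    return (sa, ea - shift, (m << shift) & MASK64)

def fp_mul(a, b):
    # Split mantissas into 32-bit halves so all products fit in 64 bits:
    # M = MA+ * MB+ + ((MA+ * MB-) >> 32) + ((MB+ * MA-) >> 32).
    sa, ea, ma = a
    sb, eb, mb = b
    ma_hi, ma_lo = ma >> 32, ma & 0xFFFFFFFF
    mb_hi, mb_lo = mb >> 32, mb & 0xFFFFFFFF
    m = ma_hi * mb_hi + ((ma_hi * mb_lo) >> 32) + ((mb_hi * ma_lo) >> 32)
    shift = lzc(m)
    return ((sa + sb) % 2, ea + eb + 1 - shift, (m << shift) & MASK64)
```

For example, with 2.0 = (0, 1, 2^63) and 1.5 = (0, 0, 3 * 2^62): `fp_add_same_sign` yields (0, 1, 7 * 2^61), i.e. 3.5; `fp_sub` yields (0, -1, 2^63), i.e. 0.5; `fp_mul` yields (0, 1, 3 * 2^62), i.e. 3.0 — exactly, because no significant bits are shifted out in these cases.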

Application Documents

# Name Date
1 1434-MUM-2011-CORRESPONDENCE(31-05-2011).pdf 2011-05-31
2 1434-MUM-2011-IntimationOfGrant15-12-2022.pdf 2022-12-15
3 1434-MUM-2011-PatentCertificate15-12-2022.pdf 2022-12-15
4 Form-5.pdf 2018-08-10
5 Form-3.pdf 2018-08-10
6 1434-MUM-2011-Written submissions and relevant documents [18-10-2022(online)].pdf 2022-10-18
7 Form-13.pdf 2018-08-10
8 1434-MUM-2011-Correspondence to notify the Controller [06-10-2022(online)].pdf 2022-10-06
9 Form-1.pdf 2018-08-10
10 1434-MUM-2011-US(14)-ExtendedHearingNotice-(HearingDate-10-10-2022).pdf 2022-09-28
11 Drawings.pdf 2018-08-10
12 1434-MUM-2011-Correspondence to notify the Controller [26-09-2022(online)].pdf 2022-09-26
13 ABSTRACT 1.jpg 2018-08-10
14 1434-MUM-2011-FORM-26 [26-09-2022(online)].pdf 2022-09-26
15 1434-MUM-2011-US(14)-ExtendedHearingNotice-(HearingDate-29-09-2022).pdf 2022-07-25
16 1434-MUM-2011-OTHERS-270515.pdf 2018-08-10
17 1434-MUM-2011-FORM 26(1-7-2011).pdf 2018-08-10
18 1434-MUM-2011-US(14)-HearingNotice-(HearingDate-05-08-2022).pdf 2022-07-14
19 1434-MUM-2011-FORM 18(29-4-2014).pdf 2018-08-10
20 1434-MUM-2011-ORIGINAL UR 6(1A) FORM 26-240519.pdf 2020-01-06
21 1434-MUM-2011-ABSTRACT [21-05-2019(online)].pdf 2019-05-21
22 1434-MUM-2011-FORM 1(1-7-2011).pdf 2018-08-10
23 1434-MUM-2011-CLAIMS [21-05-2019(online)].pdf 2019-05-21
24 1434-MUM-2011-Correspondence-270515.pdf 2018-08-10
25 1434-MUM-2011-COMPLETE SPECIFICATION [21-05-2019(online)].pdf 2019-05-21
26 1434-MUM-2011-CORRESPONDENCE(29-4-2014).pdf 2018-08-10
27 1434-MUM-2011-CORRESPONDENCE(1-7-2011).pdf 2018-08-10
28 1434-MUM-2011-FER_SER_REPLY [21-05-2019(online)].pdf 2019-05-21
29 1434-MUM-2011-FORM 5.pdf 2018-12-03
30 1434-MUM-2011-FORM-26 [21-05-2019(online)].pdf 2019-05-21
31 1434-MUM-2011-FORM 3.pdf 2018-12-03
32 1434-MUM-2011-OTHERS [21-05-2019(online)].pdf 2019-05-21
33 1434-MUM-2011-PETITION UNDER RULE 137 [21-05-2019(online)].pdf 2019-05-21
34 1434-MUM-2011-FORM 2.pdf 2018-12-03
35 1434-MUM-2011-FER.pdf 2018-12-05

Search Strategy

1 1434MUM2011search2_05-12-2018.pdf

ERegister / Renewals

3rd: 16 Feb 2023

From 10/05/2013 - To 10/05/2014

4th: 16 Feb 2023

From 10/05/2014 - To 10/05/2015

5th: 16 Feb 2023

From 10/05/2015 - To 10/05/2016

6th: 16 Feb 2023

From 10/05/2016 - To 10/05/2017

7th: 16 Feb 2023

From 10/05/2017 - To 10/05/2018

8th: 16 Feb 2023

From 10/05/2018 - To 10/05/2019

9th: 16 Feb 2023

From 10/05/2019 - To 10/05/2020

10th: 16 Feb 2023

From 10/05/2020 - To 10/05/2021

11th: 16 Feb 2023

From 10/05/2021 - To 10/05/2022

12th: 16 Feb 2023

From 10/05/2022 - To 10/05/2023

13th: 04 May 2023

From 10/05/2023 - To 10/05/2024

14th: 07 May 2024

From 10/05/2024 - To 10/05/2025

15th: 07 May 2025

From 10/05/2025 - To 10/05/2026