Abstract: A method of implementing at least one fused operation in an embedded circuit and a control unit thereof
ABSTRACT
A method of implementing at least one fused operation in an embedded circuit 11 is disclosed. In step S1, the at least one fused operation, which is stored in a workload slot 12 of a control unit 10, is run by the control unit 10. In step S2, an output of the at least one fused operation is saved in a custom fixed-point (CFX) format in an intermediate register 14 of the control unit 10. In step S3, a final fused operation is computed from at least one output of the at least one fused operation. In step S4, a final output of the final fused operation is converted into a POSIT data type prior to saving in the control unit 10. (FIGURE 1)
Description: Complete Specification:
The following specification describes and ascertains the nature of this invention and the manner in which it is to be performed:
[0001] Field of the invention:
The invention is related to a method of implementing at least one fused operation in an embedded circuit and a control unit thereof.
[0002] Background of the invention:
Most application-specific workloads, such as Artificial Intelligence (AI) or Digital Signal Processing (DSP) workloads, use long series of multiply and accumulate (MAC) operations. To execute such a series of MAC operations, most modern-day processors use fused operations, which defer the rounding of intermediate MAC results until the last MAC operation. This is done to enhance the accuracy of the final result. To implement the fused operation for the POSIT datatype, very large fixed-point registers called Quires are used. The large bit length of a Quire makes it difficult to integrate on an embedded processor because of the higher on-chip area and power dissipation. The large bit length of the Quire also drastically affects the cache performance of embedded devices, thereby degrading the speed of the hardware. Especially in the embedded application of Bosch the size of the Posit Quires
[0003] Brief description of the accompanying drawings:
An embodiment of the disclosure is described with reference to the following accompanying drawings:
[0004] Figure 1 illustrates a control unit for implementing at least one fused operation in an embedded circuit; and
[0005] Figure 2 illustrates a flow chart of a method of implementing at least one fused operation in the embedded circuit according to the present invention.
Detailed description of the embodiments:
[0006] Figure 1 illustrates a control unit 10 for implementing at least one fused operation in an embedded circuit 11 according to one embodiment of the invention. The control unit 10 runs at least one fused operation stored in a workload slot 12 of the control unit 10. The control unit 10 saves an output of the at least one fused operation in a custom fixed-point (CFX) format in an intermediate register 14 of the control unit 10. The control unit 10 computes a final fused operation from at least one output of the at least one fused operation. The control unit 10 converts a final output of the final fused operation into a POSIT data type prior to saving in the control unit 10.
[0007] The construction of the control unit 10 and its components is now explained in detail. The control unit 10 is chosen from a group of control units comprising a microprocessor, a microcontroller, a digital circuit, an ASIC and the like. The control unit 10 comprises a workload slot 12, wherein the slot holds a methodology associated with at least one fused operation, i.e., the workload comprises multiple fused operations. For example, the workload stored in the workload slot 12 of the control unit 10 is used for detecting an object. According to one embodiment of the invention, the fused operation is a fused multiply and add (FMA) function.
[0008] According to another embodiment of the invention, each of the multiple fused operations is any mathematical function. In conventional methods, the output of each fused operation is stored in a POSIT format that is known to a person skilled in the art. Because the output is saved in this format, the extended-precision part of the output is discarded and there is a high chance of an error in the final output. To avoid this, the present invention saves the outputs of the fused operations in a custom fixed-point (CFX) format.
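The loss of precision described in the preceding paragraph can be illustrated with a minimal sketch. It assumes an 8-bit posit with 2 exponent bits; the specification does not state a posit width, so this size is chosen only to keep the set of representable values small. The helper names `decode_posit` and `nearest_posit` are hypothetical, and `nearest_posit` selects the closest value by plain enumeration rather than reproducing the exact rounding rule of a posit arithmetic unit.

```python
import math

def decode_posit(bits: int, n: int = 8, es: int = 2) -> float:
    """Decode an n-bit posit bit pattern (es exponent bits) into a float."""
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):
        return math.nan  # NaR (not a real)
    negative = bool(bits >> (n - 1))
    if negative:
        bits = (-bits) & mask  # negative posits are stored in two's complement
    # The n-1 bits after the sign bit, most significant first.
    body = [(bits >> i) & 1 for i in range(n - 2, -1, -1)]
    first = body[0]
    run = 1
    while run < len(body) and body[run] == first:
        run += 1
    k = run - 1 if first == 1 else -run  # regime value
    rest = body[run + 1:]                # skip the regime terminator bit
    exp_bits = rest[:es]
    exp_bits = exp_bits + [0] * (es - len(exp_bits))  # missing bits read as 0
    e = 0
    for b in exp_bits:
        e = (e << 1) | b
    frac = sum(b * 2.0 ** -(i + 1) for i, b in enumerate(rest[es:]))
    value = 2.0 ** (k * (1 << es) + e) * (1.0 + frac)
    return -value if negative else value

def nearest_posit(x: float, n: int = 8, es: int = 2) -> float:
    """Closest representable posit value to x, found by enumeration."""
    nar = 1 << (n - 1)
    candidates = [decode_posit(b, n, es) for b in range(1 << n) if b != nar]
    return min(candidates, key=lambda v: abs(v - x))

# Around 1.0 this posit size has a spacing of 0.125, so a product such as
# 1.48236 cannot be kept exactly and is stored as 1.5.
stored = nearest_posit(1.48236)
```

Under these assumptions an intermediate result loses its low-order digits as soon as it is written back in posit form, which is the error source that a wider intermediate format is meant to avoid.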
[0009] According to one embodiment of the invention, the CFX register is an intermediate register 14 that is present in the control unit 10. The methodology that is present in the workload slot 12 comprises multiple intermediate fused operations and a final fused operation. One fused operation is performed in one iteration during a run of data in the workload. The outputs of the intermediate fused operations are stored in the CFX format and the output of the final fused operation is stored in the POSIT format. The CFX format is a predefined extended format for saving the output of the at least one fused operation. For instance, if the fused operation is an addition of two numbers x1 = 1.2345 and x2 = 2.3456, then the output of the fused operation is t1 = x1 + x2 = 1.2345 + 2.3456 = 3.5801. In the CFX format, this output is saved in the extended format.
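As a rough illustration of saving t1 in an extended fixed-point form, the sketch below stores a value as a raw integer scaled by a power of two. The bit widths `CFX_FRAC_BITS` and `NARROW_FRAC_BITS` and the helpers `to_fixed` and `from_fixed` are assumptions made for this example only; the specification does not define the CFX bit layout.

```python
from fractions import Fraction

CFX_FRAC_BITS = 16     # assumed number of fractional bits in the CFX format
NARROW_FRAC_BITS = 4   # assumed narrow format, used only for comparison

def to_fixed(value: Fraction, frac_bits: int) -> int:
    """Round value to the nearest multiple of 2**-frac_bits (raw integer)."""
    return round(value * (1 << frac_bits))

def from_fixed(raw: int, frac_bits: int) -> Fraction:
    """Recover the exact rational value held by a raw fixed-point integer."""
    return Fraction(raw, 1 << frac_bits)

x1 = Fraction("1.2345")
x2 = Fraction("2.3456")
t1 = x1 + x2  # exactly 3.5801

# Error after storing t1 in the wide (CFX-like) and the narrow format.
cfx_error = abs(from_fixed(to_fixed(t1, CFX_FRAC_BITS), CFX_FRAC_BITS) - t1)
narrow_error = abs(
    from_fixed(to_fixed(t1, NARROW_FRAC_BITS), NARROW_FRAC_BITS) - t1
)
```

With 16 fractional bits the stored value differs from 3.5801 by at most 2^-17, whereas the 4-bit format can only hold 3.5625, so the wider format keeps the digits that later operations depend on.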
[0010] Figure 2 illustrates a method of implementing at least one fused operation in an embedded circuit 11. In step S1, the at least one fused operation, which is stored in a workload slot 12 of the control unit 10, is run by the control unit 10. In step S2, an output of the at least one fused operation is saved in a custom fixed-point (CFX) format in an intermediate register 14 of the control unit 10. In step S3, a final fused operation is computed from at least one output of the at least one fused operation. In step S4, a final output of the final fused operation is converted into a POSIT data type prior to saving in the control unit 10.
[0011] The method is now explained in detail. The workload slot 12 present in the control unit 10 comprises different methodologies/algorithms that are used for various applications. One such application is an object detection method. It is to be noted that the application can be any other application known to a person skilled in the art. The methodology operates on multiple numbers of various types. For instance, the numbers can be real numbers, integers, natural numbers, decimals and the like. The methodology further comprises multiple mathematical functions, which in the present invention are referred to as the fused operations.
[0012] The mathematical functions can be addition, subtraction, multiplication and division. According to one embodiment of the invention, the fused operation is a fused multiply and add function. However, it can be any other mathematical function known in the state of the art. The methodology/algorithm present in the workload slot 12 comprises multiple fused operations, which are referred to as intermediate fused operations, and the outputs of those operations are referred to as intermediate results/outputs. There is also one final fused operation and one final output.
[0013] When the methodology in the workload is activated/run by the control unit 10, multiple operations are performed. One fused operation is performed in each iteration. For example, the workload slot 12 comprises the numbers x1, x2, x3 and x4 with the values x1 = 1.123, x2 = 1.320, x3 = 1.450 and x4 = 1.870 respectively. The control unit 10 has to perform multiplications and an addition using the fused multiply and add operation, i.e., the multiplication of x1 and x2, the multiplication of x3 and x4, and the addition of the two products.
(x1*x2) = 1.123 * 1.320 = 1.482360
[0014] The numbers x1 and x2 are stored in the POSIT format and the result/output of that operation is stored in an extended format, which is the CFX format.
The same process of saving applies to the numbers x3 and x4. The numbers x3 and x4 are stored in the POSIT format and the result/output of the operation is stored in an extended format, which is the CFX format.
(x3*x4) = 1.450 * 1.870 = 2.711500
[0015] The result of the first multiplication, 1.482360, which is in the CFX format, is stored in a CFX buffer 15 present in the control unit 10. The outputs of the intermediate fused operations are likewise saved in the CFX buffer 15, i.e.,
(x3*x4) + CFX buffer = 2.711500 + 1.482360, where the CFX buffer holds the output of the first multiplication.
Here, the final fused operation is
(x1*x2) + (x3*x4) = 1.482360 + 2.711500 (both stored in the CFX format)
and the output of the final fused operation is 4.193860, which is also in the CFX format. The CFX buffer 15 is present in the intermediate register/CFX register 14 for saving the outputs of the intermediate fused operations. In the present invention, the control unit 10 converts the final output of the final fused operation from the CFX format to the POSIT format and saves it in the memory. This reduces the space occupied in the memory.
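The worked example above can be sketched as a short program that follows steps S1 to S4. This is only an illustration under stated assumptions: the inputs are taken as the exact decimals used in the text, `CFX_FRAC_BITS` is an assumed width for the CFX register, and a narrow fixed-point format with `OUT_FRAC_BITS` fractional bits stands in for the POSIT conversion, which is not modelled here. The function names are hypothetical.

```python
from fractions import Fraction

CFX_FRAC_BITS = 24  # assumed fractional width of the CFX register
OUT_FRAC_BITS = 6   # assumed narrow format standing in for POSIT storage

def quantize(value: Fraction, frac_bits: int) -> Fraction:
    """Round value to the nearest multiple of 2**-frac_bits."""
    scale = 1 << frac_bits
    return Fraction(round(value * scale), scale)

def fused_sum_of_products(pairs):
    """Keep intermediates in the wide format and round narrowly once."""
    cfx_buffer = Fraction(0)
    for a, b in pairs:
        # S1: run one fused operation per iteration.
        # S2: save its output in the wide (CFX-like) buffer.
        cfx_buffer = quantize(cfx_buffer + a * b, CFX_FRAC_BITS)
    # S3: the last iteration has produced the final fused result.
    # S4: convert the final output to the narrow storage format.
    return quantize(cfx_buffer, OUT_FRAC_BITS)

def rounded_every_step(pairs):
    """Comparison case: every intermediate is rounded to the narrow format."""
    total = Fraction(0)
    for a, b in pairs:
        product = quantize(a * b, OUT_FRAC_BITS)
        total = quantize(total + product, OUT_FRAC_BITS)
    return total

pairs = [(Fraction("1.123"), Fraction("1.320")),
         (Fraction("1.450"), Fraction("1.870"))]
exact = Fraction("4.19386")

fused = fused_sum_of_products(pairs)    # 268/64 = 4.1875
stepwise = rounded_every_step(pairs)    # 269/64 = 4.203125
```

For these inputs the single final rounding lands on 4.1875, about 0.0064 from the exact 4.19386, while rounding each intermediate gives 4.203125, about 0.0093 away. The gap is small with two products and would be expected to grow with longer MAC chains.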
[0016] Further, the size of the intermediate register 14, which is also called the CFX register, is based on the methodology present in the workload slot 12 of the control unit 10. To find the range of the CFX operational results, the control unit 10 saves the minimum and the maximum values obtained in those mathematical functions/fused operations. The maximum slot initially holds the smallest POSIT value and the minimum slot initially holds the largest POSIT value, and each slot is updated as results are obtained. From these values, the range of the CFX results is determined and used to find the size of the CFX register 14.
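One possible reading of this range-tracking step is sketched below. It is an assumption that the minimum slot starts at the largest possible value and the maximum slot at the smallest, so that the first observed result overwrites both. The function name `size_cfx_register`, the signed fixed-point layout and the choice of fractional width are hypothetical, and headroom for rounding or overflow is not modelled.

```python
import math

def size_cfx_register(results, frac_bits):
    """Track the min/max of observed results and derive a register width."""
    if not results:
        raise ValueError("at least one result is needed")
    min_slot = math.inf    # starts at the largest value, lowered by any result
    max_slot = -math.inf   # starts at the smallest value, raised by any result
    for r in results:
        min_slot = min(min_slot, r)
        max_slot = max(max_slot, r)
    largest = max(abs(min_slot), abs(max_slot))
    # Bits needed to hold the integer part of the largest magnitude.
    int_bits = math.floor(largest).bit_length()
    total_bits = 1 + int_bits + frac_bits  # one sign bit assumed
    return min_slot, max_slot, total_bits

# Intermediate and final results from the worked example in the description.
lo, hi, width = size_cfx_register([1.48236, 2.7115, 4.19386], frac_bits=16)
```

Here the observed range is 1.48236 to 4.19386, which needs 3 integer bits, so with an assumed 16 fractional bits and a sign bit the register would be 20 bits wide, far narrower than a general-purpose Quire.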
[0017] With the above method, the errors in computing the outputs of the fused operations are reduced, and the method also provides a faster result when compared to the conventional ones.
[0018] It should be understood that embodiments explained in the description above are only illustrative and do not limit the scope of this invention. Many such embodiments and other modifications and changes in the embodiment explained in the description are envisaged. The scope of the invention is only limited by the scope of the claims.
Claims: We claim:
1. A control unit (10) for implementing at least one fused operation in an embedded circuit (11), said control unit (10) adapted to:
- run said at least one fused operation stored in a workload slot (12) of said control unit (10);
- save an output of said at least one fused operation in a custom fixed-point (CFX) format in an intermediate register (14) of said control unit (10);
- compute a final fused operation from at least one output of said at least one fused operation; and
- modify a final output of said final fused operation into a POSIT data type prior to saving in said control unit (10).
2. The control unit (10) as claimed in claim 1, wherein said workload slot comprises multiple fused operations having numbers chosen from a group of numbers comprising real numbers, integers, natural numbers, decimal numbers and the like.
3. The control unit (10) as claimed in claim 2, wherein each of said multiple fused operations is a mathematical function.
4. The control unit (10) as claimed in claim 1, wherein said CFX format is a predefined extended version of saving said output of said at least one fused operation.
5. The control unit (10) as claimed in claim 1, wherein one fused operation is performed in one iteration during a run of data in said workload slot (12).
6. The control unit (10) as claimed in claim 1, wherein said fused operation is a fused multiply and add function.
7. The control unit (10) as claimed in claim 1, wherein said final output is modified from said CFX format into said POSIT format by a conversion circuit present in said control unit (10).
8. The control unit (10) as claimed in claim 1, wherein a size of said intermediate register (14) is based on a range of said data of said workload.
9. A method of implementing at least one fused operation in an embedded circuit (11), said method comprising:
- running, by a control unit (10), said at least one fused operation stored in a workload slot (12) of said control unit (10);
- saving an output of said at least one fused operation in a custom fixed-point (CFX) format in an intermediate register (14) of said control unit (10);
- computing a final fused operation from at least one output of said at least one fused operation; and
- modifying a final output of said final fused operation into a POSIT data type prior to saving in said control unit (10).
| # | Name | Date |
|---|---|---|
| 1 | 202241043468-POWER OF AUTHORITY [29-07-2022(online)].pdf | 2022-07-29 |
| 2 | 202241043468-FORM 1 [29-07-2022(online)].pdf | 2022-07-29 |
| 3 | 202241043468-DRAWINGS [29-07-2022(online)].pdf | 2022-07-29 |
| 4 | 202241043468-DECLARATION OF INVENTORSHIP (FORM 5) [29-07-2022(online)].pdf | 2022-07-29 |
| 5 | 202241043468-COMPLETE SPECIFICATION [29-07-2022(online)].pdf | 2022-07-29 |
| 6 | 202241043468-FORM 18 [11-11-2024(online)].pdf | 2024-11-11 |