Abstract: A fused dot-product multiply-accumulate (MAC) circuit may support variable precisions of floating-point data elements to perform computations (e.g., MAC operations) in deep learning operations. An operation mode of the circuit may be selected based on the precision of an input element. The operation mode may be an FP16 mode or an FP8 mode. In the FP8 mode, one or more product exponents may be computed based on exponents of floating-point input elements. A maximum exponent may be selected from the one or more product exponents. A global maximum exponent may be selected from a plurality of maximum exponents. A product mantissa may be computed and aligned with another product mantissa based on a difference between the global maximum exponent and a corresponding maximum exponent. An adder tree may accumulate the aligned product mantissas and compute a partial sum mantissa. The partial sum mantissa may be normalized using the global maximum exponent.
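
The FP8-mode data flow described in the abstract can be illustrated with a minimal software sketch. It is not the claimed circuit: the element layout `(sign, exponent, mantissa)`, the constants `MANT_BITS` and `GUARD_BITS`, the grouping of products, and the function names `product` and `fused_dot` are all assumptions introduced here for illustration. In particular, the abstract does not state the FP8 field widths, and the final `ldexp` scaling only stands in for the hardware normalization step.

```python
from math import ldexp

MANT_BITS = 3   # assumed fraction width (E4M3-style); not stated in the source
GUARD_BITS = 8  # assumed extra low-order bits kept while aligning


def product(a, b):
    """Multiply two elements given as (sign, exponent, mantissa).

    mantissa is an integer that already includes the implicit leading 1, so
    value = (-1)**sign * mantissa * 2**(exponent - MANT_BITS).
    """
    sa, ea, ma = a
    sb, eb, mb = b
    # Product exponent comes from the input exponents; mantissas multiply.
    return sa ^ sb, ea + eb, ma * mb


def fused_dot(groups):
    """Accumulate products arranged as non-empty groups of (a, b) pairs."""
    prods = [[product(a, b) for a, b in group] for group in groups]

    # Maximum exponent per group, then the global maximum across groups.
    local_max = [max(e for _, e, _ in group) for group in prods]
    global_max = max(local_max)

    acc = 0  # partial sum mantissa, in units of 2**(global_max - 2*MANT_BITS - GUARD_BITS)
    for group, lmax in zip(prods, local_max):
        group_shift = global_max - lmax  # difference between global and group maximum
        for sign, exp, mant in group:
            shift = (lmax - exp) + group_shift
            # Right shift truncates bits that fall below the guard bits.
            aligned = (mant << GUARD_BITS) >> shift
            acc += -aligned if sign else aligned

    # Scale the partial sum mantissa by the global maximum exponent.
    return ldexp(acc, global_max - 2 * MANT_BITS - GUARD_BITS)


# Example: (3.0 * 1.0) + (-0.5 * 5.0) + (1.0 * 0.25)
a, b = (0, 1, 0b1100), (0, 0, 0b1000)
c, d = (1, -1, 0b1000), (0, 2, 0b1010)
e, f = (0, 0, 0b1000), (0, -2, 0b1000)
result = fused_dot([[(a, b), (c, d)], [(e, f)]])
print(result)  # 0.75
```

Aligning every product to one shared exponent lets the accumulation run as plain integer addition, which is the role the adder tree plays in the abstract. The guard bits in this sketch only limit truncation error from the right shifts; they are not a claim about the circuit's internal widths.
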
| # | Name | Date |
|---|---|---|
| 1 | 202547096552-PRIORITY DOCUMENTS [07-10-2025(online)].pdf | 2025-10-07 |
| 2 | 202547096552-POWER OF AUTHORITY [07-10-2025(online)].pdf | 2025-10-07 |
| 3 | 202547096552-FORM 1 [07-10-2025(online)].pdf | 2025-10-07 |
| 4 | 202547096552-DRAWINGS [07-10-2025(online)].pdf | 2025-10-07 |
| 5 | 202547096552-DECLARATION OF INVENTORSHIP (FORM 5) [07-10-2025(online)].pdf | 2025-10-07 |
| 6 | 202547096552-COMPLETE SPECIFICATION [07-10-2025(online)].pdf | 2025-10-07 |