Abstract: An apparatus to facilitate utilizing structured sparsity in systolic arrays is disclosed. The apparatus includes a processor comprising a systolic array to receive data from a plurality of source registers, the data comprising unpacked source data, structured source data that is packed based on sparsity, and metadata corresponding to the structured source data; identify portions of the unpacked source data to multiply with the structured source data, the portions of the unpacked source data identified based on the metadata; and output, to a destination register, a result of multiplication of the portions of the unpacked source data and the structured source data.
Claims:
1. An apparatus comprising:
a processor comprising a systolic array to:
receive data from a plurality of source registers, the data comprising unpacked source data, structured source data that is packed based on sparsity, and metadata corresponding to the structured source data;
identify portions of the unpacked source data to multiply with the structured source data, the portions of the unpacked source data identified based on the metadata; and
output, to a destination register, a result of multiplication of the portions of the unpacked source data and the structured source data.
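As a minimal, non-limiting sketch of the operation recited in claim 1, the following Python fragment models how the metadata can identify which portions of the unpacked source data are multiplied with the packed (structured) source data. The 2:4-style packing, the array values, and the function name `sparse_dot` are illustrative assumptions, not the claimed hardware:

```python
import numpy as np

def sparse_dot(unpacked, packed, metadata):
    # Hypothetical software model of one multiply stage: metadata[i] gives
    # the index of the unpacked element that the i-th packed (non-zero)
    # structured element is multiplied with; zero elements are never computed.
    acc = 0.0
    for value, idx in zip(packed, metadata):
        acc += value * unpacked[idx]
    return acc

# Illustration: a structured row [0.0, 2.5, 0.0, -1.0] packs to [2.5, -1.0],
# with metadata [1, 3] recording the original positions of the non-zeros.
weights_row = np.array([0.0, 2.5, 0.0, -1.0])
packed      = np.array([2.5, -1.0])
metadata    = np.array([1, 3])
activations = np.array([0.5, 1.0, 2.0, 4.0])  # unpacked source data

# The packed result matches the dense dot product, using half the products.
assert sparse_dot(activations, packed, metadata) == weights_row @ activations
```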
Description:
FIELD
[0002] This disclosure relates generally to data processing and more particularly to utilizing structured sparsity in systolic arrays.
BACKGROUND OF THE DISCLOSURE
[0003] Neural networks and other types of machine learning models are useful tools that have demonstrated their value in solving complex problems regarding pattern recognition, natural language processing, automatic speech recognition, etc. Neural networks operate using artificial neurons arranged into one or more layers that process data from an input layer to an output layer, applying weighting values to the data as it is processed. Such weighting values are determined during a training process and applied during an inference process.
[0004] Sparsity is a property of the data received by an execution unit that can be capitalized upon to improve the performance of some arithmetic and logic operations. Sparsity refers to the proportion of zero values among the data used in a series of operations. A multiplication in which one operand is zero yields zero as its result; because the result of such an operation is known in advance, the operation need not be computed, and execution time can be saved.
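A minimal sketch of this idea (the values below are illustrative only): products with a zero operand are skipped because their result is already known to be zero:

```python
def dot_skip_zeros(a, b):
    # Products with a zero operand are known to be zero in advance,
    # so they are skipped rather than computed.
    return sum(x * y for x, y in zip(a, b) if x != 0.0 and y != 0.0)

a = [0.0, 3.0, 0.0, 0.0, 2.0]  # 60% of the values are zero (sparse)
b = [1.0, 2.0, 3.0, 4.0, 5.0]
print(dot_skip_zeros(a, b))    # 16.0, computed from only 2 of 5 products
```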
[0005] An instruction that computes matrix multiplication in a systolic array is often used in machine learning (ML) algorithms to execute neural networks. In these workloads, the weights and the activations of layers of neurons are typically represented as matrices and multiplied. The weights have a high probability of containing many zero values when they are computed from a function (e.g., a ReLU function) whose output is zero for any negative input.
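For illustration only (the matrix values are assumptions), the following shows why a ReLU-style function tends to produce sparse output: every negative input maps to zero:

```python
import numpy as np

def relu(x):
    # ReLU outputs zero for any negative input, so its output matrix
    # tends to contain many zeros, i.e., it tends to be sparse.
    return np.maximum(x, 0.0)

m = np.array([[ 1.2, -0.7,  0.3, -2.1],
              [-0.4,  0.9, -1.5,  0.6]])
sparse_m = relu(m)
print(sparse_m)
print("sparsity:", np.mean(sparse_m == 0.0))  # 0.5 -- half the entries are zero
```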
BRIEF DESCRIPTION OF THE DRAWINGS
[0006] So that the manner in which the above recited features of the present embodiments can be understood in detail, a more particular description of the embodiments, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate typical embodiments and are therefore not to be considered limiting of the scope of the disclosure. The figures are not to scale. In general, the same reference numbers will be used throughout the drawing(s) and accompanying written description to refer to the same or like parts.
[0007] FIG. 1 is a block diagram of an example computing system that may be used to utilize structured sparsity in systolic arrays, according to implementations of the disclosure.
[0008] FIGS. 2A and 2B illustrate example depictions of matrix multiplication, in accordance with implementations of the disclosure.
[0009] FIG. 3 illustrates a packing process for a row of a sub-matrix, in accordance with implementations of the disclosure.
[0010] FIG. 4 illustrates an example data packing case using half float elements, in accordance with implementations of the disclosure.
[0011] FIGS. 5A and 5B depict examples of unpacked data converted into corresponding packed data and metadata, in accordance with implementations of the disclosure.
[0012] FIG. 6 illustrates an example computing environment implementing a systolic array that utilizes structured sparsity, in accordance with implementations of the disclosure.
[0013] FIG. 7 illustrates a schematic of operations of the systolic array when structured sparsity is provided in the index, according to implementations of the disclosure.
[0014] FIG. 8 is a flow diagram illustrating a method for utilizing structured sparsity in systolic arrays, in accordance with implementations of the disclosure.
[0015] FIG. 9 is a flow diagram illustrating a method for performing matrix multiplication in systolic arrays utilizing structured sparsity, in accordance with implementations of the disclosure.
[0016] FIG. 10 is a schematic diagram of an illustrative electronic computing device to enable utilization of structured sparsity in systolic arrays, according to some embodiments.
DETAILED DESCRIPTION
[0017] Implementations of the disclosure describe utilizing structured sparsity in systolic arrays. Today’s computing systems are expected to deliver near zero-wait responsiveness and superb performance while taking on large workloads for execution. Therefore, computing architectures have continually changed (e.g., improved) to accommodate large (and often demanding) workloads and increased performance expectations.