Abstract: Activations (e.g., output activations) or weights of intermediate layers of deep neural networks (DNNs) can be pruned to increase sparsity and reduce the amount of computation required in those layers or in subsequent layers. A pruning threshold may be determined, e.g., through an iterative process, and activations or weights having absolute values lower than the pruning threshold may be changed to zero. A first pruning threshold may be used to prune an output tensor or kernel of a layer. The loss in accuracy of the DNN due to the pruning may then be determined. A second pruning threshold may be determined based on the first pruning threshold and the accuracy loss. The DNN may be modified by adding a pruning operation to the layer. The pruning operation can prune output tensors or kernels of the layer based on the second pruning threshold.
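
The iterative threshold determination summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the second threshold is derived from the first threshold and the accuracy loss, so the update rule here (double the threshold while the loss stays within a budget, then bisect) is an assumption chosen for illustration, as are the names `prune`, `sparsity`, `search_threshold`, `toy_loss`, and the `max_loss` budget. The sketch also assumes that the accuracy loss does not decrease as the threshold grows, and it uses the fraction of removed magnitude as a stand-in for a real accuracy measurement.

```python
from typing import Callable, List, Sequence


def prune(values: Sequence[float], threshold: float) -> List[float]:
    """Set every element whose absolute value is below the threshold to zero."""
    return [0.0 if abs(v) < threshold else v for v in values]


def sparsity(values: Sequence[float]) -> float:
    """Fraction of elements that are exactly zero."""
    return sum(1 for v in values if v == 0.0) / len(values)


def search_threshold(
    first_threshold: float,
    evaluate_loss: Callable[[float], float],
    max_loss: float,
    iterations: int = 20,
) -> float:
    """Hypothetical search rule (not taken from the source).

    Each new threshold is derived from the previous threshold and the
    accuracy loss it produced: double it while the loss is within budget,
    then bisect between the last acceptable and first unacceptable values.
    Assumes first_threshold > 0 and a loss that is non-decreasing in the
    threshold.
    """
    low = 0.0    # largest threshold known to stay within the loss budget
    high = None  # smallest threshold known to exceed the loss budget
    threshold = first_threshold
    for _ in range(iterations):
        loss = evaluate_loss(threshold)
        if loss <= max_loss:
            low = threshold
        else:
            high = threshold
        threshold = low * 2.0 if high is None else (low + high) / 2.0
    return low


# Toy stand-in for a layer's output activations (illustrative values only).
activations = [0.05, -0.6, 0.02, 0.9, -0.1, 0.3, -0.01, 0.7]
total_magnitude = sum(abs(a) for a in activations)


def toy_loss(threshold: float) -> float:
    """Proxy for accuracy loss: share of total magnitude removed by pruning."""
    kept = sum(abs(p) for p in prune(activations, threshold))
    return (total_magnitude - kept) / total_magnitude


threshold = search_threshold(first_threshold=0.05, evaluate_loss=toy_loss, max_loss=0.1)
pruned = prune(activations, threshold)
print(threshold, pruned, sparsity(pruned))
```

In a real setting, `toy_loss` would be replaced by an evaluation of the DNN on a validation set with the candidate threshold applied to the layer, and the returned threshold would parameterize the pruning operation added to that layer.
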
| # | Name | Date |
|---|---|---|
| 1 | 202547117704-PRIORITY DOCUMENTS [26-11-2025(online)].pdf | 2025-11-26 |
| 2 | 202547117704-POWER OF AUTHORITY [26-11-2025(online)].pdf | 2025-11-26 |
| 3 | 202547117704-FORM 1 [26-11-2025(online)].pdf | 2025-11-26 |
| 4 | 202547117704-DRAWINGS [26-11-2025(online)].pdf | 2025-11-26 |
| 5 | 202547117704-DECLARATION OF INVENTORSHIP (FORM 5) [26-11-2025(online)].pdf | 2025-11-26 |
| 6 | 202547117704-COMPLETE SPECIFICATION [26-11-2025(online)].pdf | 2025-11-26 |