
Techniques To Detect Fusible Operators With Machine Learning

Abstract: Various embodiments are generally directed to techniques to detect fusible operators with machine learning, such as by evaluating a set of operators in a graph of a machine learning model to identify fusion candidates comprising subgraphs of the graph with two or more operators to combine, for instance. Some embodiments are particularly directed to utilizing a machine learning classifier to evaluate fusion candidates using a set of features of the fusion candidate.


Patent Information

Application #:
Filing Date: 08 December 2020
Publication Number: 50/2020
Publication Type: INA
Invention Field: COMPUTER SCIENCE
Status:
Parent Application:
Patent Number:
Legal Status:
Grant Date: 2025-01-30
Renewal Date:

Applicants

INTEL CORPORATION
2200 Mission College Blvd., Santa Clara, California 95054

Inventors

1. YAO, Weifeng
No. 880, Zi Xing Road, Shanghai 200241
2. HU, Xiao
No. 880 Zi Xing Road Shanghai 200241
3. MA, Hongpeng
Room 2902, South Qinzhou Rd., Xuhui District Shanghai 200235
4. LIU, Yanan
Room 101, Building 20 No. 399 Zhujiagang Road, Pudong New District Shanghai 201318
5. ZHOU, Huan, H.
No. 880 Zi Xing Road, Minhang District Shanghai 200240
6. YU, Xiaokun
880 Zixing Road Zizhu Science Park, Minhang District Shanghai 200240
7. LU, Zijie
Room 502, 5 Songnansicun, Baoshan District Shanghai 200441

Claims

1. An apparatus, comprising:
a processor; and
a memory comprising instructions that when executed by the processor cause the processor to:
identify input comprising one or more machine learning models that each include a graph of operators;
mine the one or more machine learning models based on one or more operational parameters to determine one or more fusion candidates, each of the one or more fusion candidates comprising a subgraph of at least one graph of operators, wherein each subgraph includes two or more operators;
extract a feature set from each of the one or more fusion candidates;
utilize a machine learning classifier to evaluate the one or more fusion candidates based on the feature sets extracted from each of the one or more fusion candidates; and
provide, as output, a proposed candidate of the one or more fusion candidates to fuse based on evaluation of the one or more fusion candidates.
2. The apparatus of claim 1, the memory comprising instructions that when executed by the processor cause the processor to combine each operator in the subgraph of the proposed candidate to fuse the proposed candidate into a fused candidate.
3. The apparatus of claim 2, the memory comprising instructions that when executed by the processor cause the processor to evaluate computational efficiency of a first machine learning model with the proposed candidate and a second machine learning model with the fused candidate to validate the proposed candidate.
4. The apparatus of claim 3, the memory comprising instructions that when executed by the processor cause the processor to utilize compiler stacks to evaluate computational efficiency of the first and second machine learning models.

5. The apparatus of claim 3, the memory comprising instructions that when executed by the processor cause the processor to utilize a tensor virtual machine (TVM) to evaluate computational efficiency of the first and second machine learning models.
6. The apparatus of claim 1, the machine learning model comprising a deep neural network (DNN) model, wherein each operator comprises a layer in the DNN model.
7. The apparatus of claim 1, the memory comprising instructions that when executed by the processor cause the processor to rank each of the one or more fusion candidates based on the feature sets to identify the proposed candidate.
8. The apparatus of claim 1, wherein the feature set includes the one or more operational parameters.
9. The apparatus of claim 1, wherein the one or more operational parameters include one or more of a frequency of utilization, a computational cost, and a memory cost.
10. The apparatus of claim 1, the memory comprising instructions that when executed by the processor cause the processor to utilize weighted frequent subgraph mining to mine the one or more machine learning models based on the one or more operational parameters to determine the one or more fusion candidates.
11. The apparatus of claim 10, the memory comprising instructions that when executed by the processor cause the processor to generate an edge weight metric based on the one or more operational parameters to mine the one or more machine learning models.
12. The apparatus of claim 1, each feature set comprising one or more core features and one or more uncore features.

13. The apparatus of claim 12, the core features comprising one or more of instructions retired, elapsed core clock ticks, core frequency, L2 cache hits and misses, and L3 cache hits and misses.
14. The apparatus of claim 12, the uncore features comprising one or more of read bytes from memory controllers, bytes written to memory controllers, and data traffic transferred via interconnect links.
15. At least one non-transitory computer-readable medium comprising a set of instructions that, in response to being executed by a processor circuit, cause the processor circuit to:
identify input comprising one or more machine learning models that each include a graph of operators;
mine the one or more machine learning models based on one or more operational parameters to determine one or more fusion candidates, each of the one or more fusion candidates comprising a subgraph of at least one graph of operators, wherein each subgraph includes two or more operators;
extract a feature set from each of the one or more fusion candidates;
utilize a machine learning classifier to evaluate the one or more fusion candidates based on the feature sets extracted from each of the one or more fusion candidates; and
identify a proposed candidate of the one or more fusion candidates to fuse based on evaluation of the one or more fusion candidates.
16. The at least one non-transitory computer-readable medium of claim 15, comprising instructions that, in response to being executed by the processor circuit, cause the processor circuit to utilize a performance counter monitor (PCM) to extract the feature sets.

17. The at least one non-transitory computer-readable medium of claim 15, wherein each feature set includes indications of one or more of data movement patterns, computation patterns, system resource utilization, frequency, computation cost, and memory cost.
18. The at least one non-transitory computer-readable medium of claim 15, the machine learning classifier comprising a recurrent neural network (RNN).
19. The at least one non-transitory computer-readable medium of claim 18, comprising instructions that, in response to being executed by the processor circuit, cause the processor circuit to map the feature sets to vectors corresponding to fusibility.
20. The at least one non-transitory computer-readable medium of claim 19, comprising instructions that, in response to being executed by the processor circuit, cause the processor circuit to calculate a probability that each fusion candidate is fusible with the vectors corresponding to fusibility.
21. A computer-implemented method, comprising:
identifying input comprising one or more machine learning models that each include a graph of operators;
mining the one or more machine learning models based on one or more operational parameters to determine one or more fusion candidates, each of the one or more fusion candidates comprising a subgraph of at least one graph of operators, wherein each subgraph includes two or more operators;
extracting a feature set from each of the one or more fusion candidates;
utilizing a machine learning classifier to evaluate the one or more fusion candidates based on the feature sets extracted from each of the one or more fusion candidates; and
identifying a proposed candidate of the one or more fusion candidates to fuse based on evaluation of the one or more fusion candidates.

22. The computer-implemented method of claim 21, comprising combining each operator in the subgraph of the proposed candidate to fuse the proposed candidate into a fused candidate.
23. The computer-implemented method of claim 22, comprising evaluating computational efficiency of a first machine learning model with the proposed candidate and a second machine learning model with the fused candidate to validate the proposed candidate.
24. An apparatus, comprising:
means for identifying input comprising one or more machine learning models that each include a graph of operators;
means for mining the one or more machine learning models based on one or more operational parameters to determine one or more fusion candidates, each of the one or more fusion candidates comprising a subgraph of at least one graph of operators, wherein each subgraph includes two or more operators;
means for extracting a feature set from each of the one or more fusion candidates;
means for utilizing a machine learning classifier to evaluate the one or more fusion candidates based on the feature sets extracted from each of the one or more fusion candidates; and
means for identifying a proposed candidate of the one or more fusion candidates to fuse based on evaluation of the one or more fusion candidates.
25. The apparatus of claim 24, comprising means for utilizing weighted frequent subgraph mining to mine the one or more machine learning models based on the one or more operational parameters to determine the one or more fusion candidates.
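The pipeline recited in claim 1 (identify operator graphs, mine fusion candidates, extract features, classify, propose a candidate) can be sketched in a few lines of Python. Everything below is an illustrative assumption rather than the patented implementation: the toy graph, the function names (`mine_candidates`, `extract_features`, `classify`), and the scoring rule are all hypothetical stand-ins.

```python
# Hypothetical sketch of the claimed flow: a model is a directed graph of
# operators, and the simplest fusion candidates are adjacent operator pairs.
model = {  # operator -> downstream operators
    "conv1": ["bn1"],
    "bn1": ["relu1"],
    "relu1": ["conv2"],
    "conv2": [],
}

def mine_candidates(graph):
    """Enumerate connected two-operator subgraphs as fusion candidates."""
    return [(op, succ) for op, succs in graph.items() for succ in succs]

def extract_features(candidate):
    """Toy feature set: operator-name lengths standing in for real
    operational parameters (frequency, compute cost, memory cost)."""
    return [len(op) for op in candidate]

def classify(features):
    """Stub classifier: a fixed linear score in place of a trained model."""
    return sum(features)  # higher score = more likely fusible

candidates = mine_candidates(model)
proposed = max(candidates, key=lambda c: classify(extract_features(c)))
```

With this toy graph, the pair whose stub score is highest (`("relu1", "conv2")`) becomes the proposed candidate; a real system would substitute the mining, feature, and classifier stages the later claims describe.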
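Claims 9-11 name frequency of utilization, computational cost, and memory cost as the operational parameters behind an edge weight metric for weighted frequent subgraph mining. One plausible shape for such a metric is a weighted sum per edge, with subgraphs retained only when their weight clears a support threshold; the coefficients and threshold below are arbitrary illustrative assumptions, not values from the specification.

```python
def edge_weight(frequency, compute_cost, memory_cost,
                w_freq=0.5, w_compute=0.3, w_mem=0.2):
    """Illustrative scalar edge weight combining the three operational
    parameters of claim 9 (the coefficients are assumptions)."""
    return w_freq * frequency + w_compute * compute_cost + w_mem * memory_cost

# A mined edge (or subgraph) is kept when its weight clears a support
# threshold, in the spirit of weighted frequent subgraph mining.
edges = {("conv", "bn"): edge_weight(10, 4.0, 2.0),
         ("bn", "relu"): edge_weight(10, 1.0, 1.0)}
frequent = {e: w for e, w in edges.items() if w >= 6.0}
```

Here only the `("conv", "bn")` edge (weight 6.6) survives the threshold of 6.0, mimicking how heavily used, costly patterns would be prioritized as fusion candidates.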
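Claims 13 and 14 split each feature set into core counters and uncore counters. A record of that shape, as a performance counter monitor might populate it, could look like the following; the field names and sample values are illustrative, not an interface from the specification.

```python
from dataclasses import dataclass, asdict

@dataclass
class FusionFeatures:
    # Core features (claim 13)
    instructions_retired: int
    core_clock_ticks: int
    core_frequency_ghz: float
    l2_hits: int
    l2_misses: int
    l3_hits: int
    l3_misses: int
    # Uncore features (claim 14)
    bytes_read_from_mc: int       # read bytes from memory controllers
    bytes_written_to_mc: int      # bytes written to memory controllers
    interconnect_bytes: int       # data traffic over interconnect links

sample = FusionFeatures(1_000_000, 400_000, 2.5,
                        90_000, 10_000, 8_000, 2_000,
                        64_000, 32_000, 16_000)
vector = list(asdict(sample).values())  # flat feature vector for a classifier
```

Flattening the record into `vector` reflects how the claimed classifier would consume a fusion candidate's feature set as a single input.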
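Claims 18-20 describe an RNN classifier that maps feature sets to vectors corresponding to fusibility and then computes a probability that each candidate is fusible. The minimal stand-in below runs a single recurrent unit over a feature sequence and applies a logistic output; the weights and the one-unit architecture are made-up illustrations, not the claimed network.

```python
import math

def rnn_fusibility(feature_seq, w_in=0.01, w_rec=0.5, w_out=1.0, bias=-2.0):
    """One-unit recurrent pass over a feature sequence followed by a
    sigmoid: an illustrative stand-in for the claimed RNN that maps a
    feature set to a fusibility probability (all weights are arbitrary)."""
    h = 0.0
    for x in feature_seq:
        h = math.tanh(w_in * x + w_rec * h)  # hidden state ~ "fusibility vector"
    logit = w_out * h + bias
    return 1.0 / (1.0 + math.exp(-logit))  # probability the candidate is fusible

p = rnn_fusibility([120.0, 80.0, 40.0])
```

A candidate would then be proposed for fusion when its probability clears a decision threshold, with the actual RNN trained on measured feature sets rather than fixed weights.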

Documents

Application Documents

# Name Date
1 202047053320-PROOF OF RIGHT [08-12-2020(online)].pdf 2020-12-08
2 202047053320-FORM 1 [08-12-2020(online)].pdf 2020-12-08
3 202047053320-DRAWINGS [08-12-2020(online)].pdf 2020-12-08
4 202047053320-DECLARATION OF INVENTORSHIP (FORM 5) [08-12-2020(online)].pdf 2020-12-08
5 202047053320-COMPLETE SPECIFICATION [08-12-2020(online)].pdf 2020-12-08
6 202047053320-FORM-26 [16-02-2021(online)].pdf 2021-02-16
7 202047053320-FORM 3 [07-06-2021(online)].pdf 2021-06-07
8 202047053320.pdf 2021-10-18
9 202047053320-abstract.jpg 2021-10-18
10 202047053320-FORM 3 [08-12-2021(online)].pdf 2021-12-08
11 202047053320-FORM 18 [13-04-2022(online)].pdf 2022-04-13
12 202047053320-FER.pdf 2022-08-24
13 202047053320-Information under section 8(2) [22-12-2022(online)].pdf 2022-12-22
14 202047053320-FORM 3 [22-12-2022(online)].pdf 2022-12-22
15 202047053320-Proof of Right [08-02-2023(online)].pdf 2023-02-08
16 202047053320-PETITION UNDER RULE 137 [24-02-2023(online)].pdf 2023-02-24
17 202047053320-OTHERS [24-02-2023(online)].pdf 2023-02-24
18 202047053320-FER_SER_REPLY [24-02-2023(online)].pdf 2023-02-24
19 202047053320-CLAIMS [24-02-2023(online)].pdf 2023-02-24
20 202047053320-FORM 3 [13-09-2023(online)].pdf 2023-09-13
21 202047053320-FORM 3 [13-03-2024(online)].pdf 2024-03-13
22 202047053320-US(14)-HearingNotice-(HearingDate-10-12-2024).pdf 2024-11-18
23 202047053320-Correspondence to notify the Controller [18-11-2024(online)].pdf 2024-11-18
24 202047053320-Information under section 8(2) [20-12-2024(online)].pdf 2024-12-20
25 202047053320-FORM 3 [20-12-2024(online)].pdf 2024-12-20
26 202047053320-Written submissions and relevant documents [24-12-2024(online)].pdf 2024-12-24
27 202047053320-Annexure [24-12-2024(online)].pdf 2024-12-24
28 202047053320-PatentCertificate30-01-2025.pdf 2025-01-30
29 202047053320-IntimationOfGrant30-01-2025.pdf 2025-01-30

Search Strategy

1 202047053320E_23-08-2022.pdf

ERegister / Renewals

3rd: 16 Apr 2025 (From 28/01/2021 To 28/01/2022)
4th: 16 Apr 2025 (From 28/01/2022 To 28/01/2023)
5th: 16 Apr 2025 (From 28/01/2023 To 28/01/2024)
6th: 16 Apr 2025 (From 28/01/2024 To 28/01/2025)
7th: 16 Apr 2025 (From 28/01/2025 To 28/01/2026)