Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more of a plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.
1. A system comprising:
a plurality of heterogeneous processing elements; and
a hardware heterogeneous scheduler to dispatch instructions for execution on one or more of the plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.
2. The system of claim 1, wherein the plurality of heterogeneous processing elements comprises an in-order processor core, an out-of-order processor core, and a packed data processor core.
3. The system of claim 2, wherein the plurality of heterogeneous processing elements further comprises an accelerator.
4. The system of any of claims 1-3, wherein the hardware heterogeneous scheduler further comprises:
a program phase detector to detect a program phase of the code fragment;
wherein the plurality of heterogeneous processing elements includes a first processing element having a first microarchitecture and a second processing element having a second microarchitecture different from the first microarchitecture;
wherein the program phase is one of a plurality of program phases, including a first phase and a second phase, and the dispatch of instructions is based in part on the detected program phase; and
wherein processing of the code fragment by the first processing element is to produce improved performance per watt characteristics as compared to processing of the code fragment by the second processing element.
5. The system of any of claims 1-4, wherein the hardware heterogeneous scheduler further comprises:
a selector to select a type of processing element of the plurality of processing elements to execute the received code fragment and schedule the code fragment on a processing element of the selected type of processing elements via dispatch.
6. The system of claim 1, wherein the code fragment is one or more instructions associated with a software thread.
7. The system of any of claims 5-6, wherein for a data parallel program phase the selected type of processing element is a processing core to execute single instruction, multiple data (SIMD) instructions.
8. The system of any of claims 5-7, wherein for a data parallel program phase the selected type of processing element is circuitry to support dense arithmetic primitives.
9. The system of any of claims 5-7, wherein for a data parallel program phase the selected type of processing element is an accelerator.
10. The system of any of claims 5-9, wherein a data parallel program phase comprises data elements that are processed simultaneously using a same control flow.
11. The system of any of claims 5-10, wherein for a thread parallel program phase the selected type of processing element is a scalar processing core.
12. The system of any of claims 5-11, wherein a thread parallel program phase comprises data dependent branches that use unique control flows.
13. The system of any of claims 2-12, wherein for a serial program phase the selected type of processing element is an out-of-order core.
14. The system of any of claims 2-13, wherein for a data parallel program phase the selected type of processing element is a processing core to execute single instruction, multiple data (SIMD) instructions.
15. The system of any of claims 1-14, wherein the hardware heterogeneous scheduler is to support multiple code types including compiled, intrinsics, assembly, libraries, intermediate, offload, and device.
16. The system of any of claims 5-15, wherein the hardware heterogeneous scheduler is to emulate functionality when the selected type of processing element cannot natively handle the code fragment.
17. The system of any of claims 1-15, wherein the hardware heterogeneous scheduler is to emulate functionality when a number of hardware threads available is oversubscribed.
18. The system of any of claims 5-15, wherein the hardware heterogeneous scheduler is to emulate functionality when the selected type of processing element cannot natively handle the code fragment.
19. The system of any of claims 5-18, wherein the selection of a type of processing element of the plurality of heterogeneous processing elements is transparent to a user.
20. The system of any of claims 5-19, wherein the selection of a type of processing element of the plurality of heterogeneous processing elements is transparent to an operating system.
21. The system of any of claims 1-20, wherein the hardware heterogeneous scheduler is to present a homogeneous multiprocessor programming model to make each thread appear to a programmer as if it is executing on a scalar core.
22. The system of claim 21, wherein the presented homogeneous multiprocessor programming model is to present an appearance of support for a full instruction set.
23. The system of any of claims 1-22, wherein the plurality of heterogeneous processing elements is to share a memory address space.
24. The system of any of claims 1-23, wherein the hardware heterogeneous scheduler includes a binary translator that is to be executed on one of the heterogeneous processing elements.
25. The system of any of claims 5-24, wherein a default selection of a type of processing element of the plurality of heterogeneous processing elements is a latency optimized core.
26. The system of any of claims 1-25, wherein the hardware heterogeneous scheduler is to select a protocol to use on a multi-protocol interface for the dispatched instructions.
27. The system of claim 26, wherein a first protocol supported by the multi-protocol bus interface comprises a memory interface protocol to be used to access a system memory address space.
28. The system of any of claims 26-27, wherein a second protocol supported by the multi-protocol bus interface comprises a cache coherency protocol to maintain coherency between data stored in a local memory of the accelerator and a memory subsystem of a host processor including a host cache hierarchy and a system memory.
29. The system of any of claims 26-28, wherein a third protocol supported by the multi-protocol bus interface comprises a serial link protocol supporting device discovery, register access, configuration, initialization, interrupts, direct memory access, and address translation services.
30. The system of claim 29, wherein the third protocol comprises the Peripheral Component Interconnect Express (PCIe) protocol.
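
The phase-based selection recited in the claims above (serial phase to an out-of-order core, data parallel phase to a SIMD core, thread parallel phase to a scalar core, a latency optimized core by default, with emulation when the selected type cannot natively handle the fragment) can be illustrated with a small software model. This is only a sketch under stated assumptions: the claims describe a hardware scheduler, and every name below (`Phase`, `ElementType`, `PHASE_TO_ELEMENT`, `select_element_type`, the `native_types` parameter) is hypothetical and not taken from the specification.

```python
from enum import Enum, auto


class Phase(Enum):
    """Program phases named in the claims (hypothetical encoding)."""
    SERIAL = auto()
    DATA_PARALLEL = auto()
    THREAD_PARALLEL = auto()


class ElementType(Enum):
    """Processing element types named in the claims (hypothetical encoding)."""
    OUT_OF_ORDER_CORE = auto()
    SIMD_CORE = auto()
    SCALAR_CORE = auto()
    LATENCY_OPTIMIZED_CORE = auto()


# One possible phase-to-type mapping. The claims also allow other targets
# for a data parallel phase (an accelerator, dense arithmetic circuitry).
PHASE_TO_ELEMENT = {
    Phase.SERIAL: ElementType.OUT_OF_ORDER_CORE,
    Phase.DATA_PARALLEL: ElementType.SIMD_CORE,
    Phase.THREAD_PARALLEL: ElementType.SCALAR_CORE,
}


def select_element_type(phase, native_types):
    """Return (selected_type, emulate).

    phase: a Phase, or None when no phase has been detected.
    native_types: set of ElementType values assumed able to run the
        code fragment natively (an assumption of this sketch).
    emulate is True when the selected type cannot natively handle
    the fragment, so its functionality would have to be emulated.
    """
    selected = PHASE_TO_ELEMENT.get(phase, ElementType.LATENCY_OPTIMIZED_CORE)
    emulate = selected not in native_types
    return selected, emulate
```

For example, `select_element_type(Phase.DATA_PARALLEL, {ElementType.SCALAR_CORE})` selects the SIMD core type and flags emulation, because the fragment is assumed not to run natively on that type.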
| # | Name | Date |
|---|---|---|
| 1 | 201947023217-FORM 1 [12-06-2019(online)].pdf | 2019-06-12 |
| 2 | 201947023217-DRAWINGS [12-06-2019(online)].pdf | 2019-06-12 |
| 3 | 201947023217-DECLARATION OF INVENTORSHIP (FORM 5) [12-06-2019(online)].pdf | 2019-06-12 |
| 4 | 201947023217-COMPLETE SPECIFICATION [12-06-2019(online)].pdf | 2019-06-12 |
| 5 | 201947023217.pdf | 2019-06-13 |
| 6 | Correspondence by Agent _Form-5_14-06-2019.pdf | 2019-06-14 |
| 7 | 201947023217-FORM-26 [26-06-2019(online)].pdf | 2019-06-26 |
| 8 | Correspondence by Agent_Power of Attorney_01-07-2019.pdf | 2019-07-01 |
| 9 | 201947023217-FORM 3 [09-12-2019(online)].pdf | 2019-12-09 |
| 10 | 201947023217-FORM 18 [19-11-2020(online)].pdf | 2020-11-19 |
| 11 | 201947023217-FER.pdf | 2021-12-14 |
| 12 | 201947023217-OTHERS [14-06-2022(online)].pdf | 2022-06-14 |
| 13 | 201947023217-FORM 3 [14-06-2022(online)].pdf | 2022-06-14 |
| 14 | 201947023217-FER_SER_REPLY [14-06-2022(online)].pdf | 2022-06-14 |
| 15 | 201947023217-CLAIMS [14-06-2022(online)].pdf | 2022-06-14 |
| 16 | 201947023217-Proof of Right [29-08-2023(online)].pdf | 2023-08-29 |
| 17 | 201947023217-US(14)-HearingNotice-(HearingDate-14-05-2024).pdf | 2024-04-22 |
| 18 | 201947023217-Correspondence to notify the Controller [25-04-2024(online)].pdf | 2024-04-25 |
| 19 | 201947023217-FORM-26 [13-05-2024(online)].pdf | 2024-05-13 |
| 20 | 201947023217-Written submissions and relevant documents [29-05-2024(online)].pdf | 2024-05-29 |
| 21 | 201947023217-PETITION UNDER RULE 137 [29-05-2024(online)].pdf | 2024-05-29 |
| 22 | 201947023217-MARKED COPIES OF AMENDEMENTS [29-05-2024(online)].pdf | 2024-05-29 |
| 23 | 201947023217-FORM 3 [29-05-2024(online)].pdf | 2024-05-29 |
| 24 | 201947023217-FORM 13 [29-05-2024(online)].pdf | 2024-05-29 |
| 25 | 201947023217-Annexure [29-05-2024(online)].pdf | 2024-05-29 |
| 26 | 201947023217-AMMENDED DOCUMENTS [29-05-2024(online)].pdf | 2024-05-29 |
| 27 | 201947023217-PatentCertificate20-06-2024.pdf | 2024-06-20 |
| 28 | 201947023217-IntimationOfGrant20-06-2024.pdf | 2024-06-20 |
| 1 | SearchE_20-10-2021.pdf | |