Abstract: Systems, methods, and apparatuses relating to circuitry to implement toggle point insertion for a clustered decode pipeline are described. In one example, a hardware processor core includes a first decode cluster comprising a plurality of decoder circuits, a second decode cluster comprising a plurality of decoder circuits, and a toggle point control circuit to toggle between sending instructions requested for decoding between the first decode cluster and the second decode cluster, wherein the toggle point control circuit is to: determine a location in an instruction stream as a candidate toggle point to switch the sending of the instructions requested for decoding between the first decode cluster and the second decode cluster, track a number of times a characteristic of multiple previous decodes of the instruction stream is present for the location, and cause insertion of a toggle point at the location, based on the number of times, to switch the sending of the instructions requested for decoding between the first decode cluster and the second decode cluster.
Description:
RELATED APPLICATION
[0001] The present application claims priority to U.S. Non-Provisional Patent Application No. 17/484,969, filed on 24 September 2021 and titled “SCALABLE TOGGLE POINT CONTROL CIRCUITRY FOR A CLUSTERED DECODE PIPELINE”, the entire disclosure of which is hereby incorporated by reference.
TECHNICAL FIELD
[0002] The disclosure relates generally to electronics, and, more specifically, an example of the disclosure relates to circuitry to implement toggle point insertion for a clustered decode pipeline.
BACKGROUND
[0003] A processor, or set of processors, executes instructions from an instruction set, e.g., the instruction set architecture (ISA). The instruction set is the part of the computer architecture related to programming, and generally includes the native data types, instructions, register architecture, addressing modes, memory architecture, interrupt and exception handling, and external input and output (I/O). It should be noted that the term instruction herein may refer to a macro-instruction, e.g., an instruction that is provided to the processor for execution, or to a micro-instruction, e.g., an instruction that results from a processor’s decoder decoding macro-instructions.
BRIEF DESCRIPTION OF THE DRAWINGS
[0004] The present disclosure is illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:
[0005] Figure 1 illustrates a processor core having a plurality of decode clusters and a toggle point control circuit according to examples of the disclosure.
[0006] Figure 2 illustrates an example clustered decode program flow according to examples of the disclosure.
[0007] Figure 3 illustrates an example format of a toggle point tracking data structure according to examples of the disclosure.
[0008] Figure 4 illustrates a flow diagram for dynamic load balancing according to examples of the disclosure.
[0009] Figure 5 illustrates a flow diagram for an invalidation holding finite state machine according to examples of the disclosure.
[0010] Figure 6 is a flow diagram illustrating operations for inserting a toggle point to switch the decoding of an instruction stream between a plurality of decode clusters according to examples of the disclosure.
[0011] Figure 7A is a block diagram illustrating both an exemplary in-order pipeline and an exemplary register renaming, out-of-order issue/execution pipeline according to examples of the disclosure.
[0012] Figure 7B is a block diagram illustrating both an example of an in-order architecture core and an exemplary register renaming, out-of-order issue/execution architecture core to be included in a processor according to examples of the disclosure.
[0013] Figure 8A is a block diagram of a single processor core, along with its connection to the on-die interconnect network and with its local subset of the Level 2 (L2) cache, according to examples of the disclosure.
[0014] Figure 8B is an expanded view of part of the processor core in Figure 8A according to examples of the disclosure.
[0015] Figure 9 is a block diagram of a processor that may have more than one core, may have an integrated memory controller, and may have integrated graphics according to examples of the disclosure.
[0016] Figure 10 is a block diagram of a system in accordance with one example of the present disclosure.
[0017] Figure 11 is a block diagram of a more specific exemplary system in accordance with an example of the present disclosure.
[0018] Figure 12 is a block diagram of a second more specific exemplary system in accordance with an example of the present disclosure.
Claims:
1. A hardware processor core comprising:
a first decode cluster comprising a plurality of decoder circuits;
a second decode cluster comprising a plurality of decoder circuits; and
a toggle point control circuit to toggle between sending instructions requested for decoding between the first decode cluster and the second decode cluster, wherein the toggle point control circuit is to:
determine a location in an instruction stream as a candidate toggle point to switch the sending of the instructions requested for decoding between the first decode cluster and the second decode cluster,
track a number of times a characteristic of multiple previous decodes of the instruction stream is present for the location, and
cause insertion of a toggle point at the location, based on the number of times, to switch the sending of the instructions requested for decoding between the first decode cluster and the second decode cluster.
2. The hardware processor core of claim 1, wherein the characteristic is a number of micro-operations decoded from the instruction stream before the location and after an immediately previous switch of decoding of the instruction stream between the first decode cluster and the second decode cluster.
3. The hardware processor core of claim 1, wherein the characteristic is a number of macro-instructions decoded from the instruction stream before the location and after an immediately previous switch of decoding of the instruction stream between the first decode cluster and the second decode cluster.
4. The hardware processor core of claim 1, wherein the insertion of the toggle point comprises insertion of a branch instruction in a branch target buffer of the hardware processor core.
5. The hardware processor core of claim 1, wherein the toggle point control circuit is to further remove the location as the candidate toggle point when an existing toggle point is encountered within a threshold number of instructions after the location in a subsequent decode of the instruction stream.
6. The hardware processor core of claim 1, wherein the characteristic is a number of micro-operations decoded from the instruction stream before the location.
7. The hardware processor core of claim 1, wherein the toggle point control circuit comprises a timer and is to stop tracking the number of times the characteristic of multiple previous decodes of the instruction stream is present for the location after a tracking time from the timer exceeds a threshold time.
8. The hardware processor core of any one of claims 1-7, wherein the toggle point control circuit is to determine a plurality of candidate toggle points and track a corresponding number of times a respective characteristic of multiple previous decodes of the instruction stream is present for each respective location.
9. A method comprising:
receiving an instruction stream requested for decode by a hardware processor core comprising a first decode cluster having a plurality of decoder circuits and a second decode cluster having a plurality of decoder circuits;
determining, by a toggle point control circuit of the hardware processor core, a location in the instruction stream as a candidate toggle point to switch sending of the instructions requested for decoding between the first decode cluster and the second decode cluster;
tracking, by the toggle point control circuit, a number of times a characteristic of multiple previous decodes of the instruction stream is present for the location; and
inserting a toggle point at the location, based on the number of times, to switch the sending of the instructions requested for decoding between the first decode cluster and the second decode cluster.
10. The method of claim 9, wherein the characteristic is a number of micro-operations decoded from the instruction stream before the location and after an immediately previous switch of decoding of the instruction stream between the first decode cluster and the second decode cluster.
11. The method of claim 9, wherein the characteristic is a number of macro-instructions decoded from the instruction stream before the location and after an immediately previous switch of decoding of the instruction stream between the first decode cluster and the second decode cluster.
12. The method of claim 9, wherein the insertion of the toggle point comprises inserting a branch instruction in a branch target buffer of the hardware processor core.
13. The method of claim 9, further comprising removing the location as the candidate toggle point when an existing toggle point is encountered within a threshold number of instructions after the location in a subsequent decode of the instruction stream.
14. The method of claim 9, wherein the characteristic is a number of micro-operations decoded from the instruction stream before the location.
15. The method of claim 9, further comprising stopping the tracking of the number of times the characteristic of multiple previous decodes of the instruction stream is present for the location after a tracking time exceeds a threshold time.
16. The method of any one of claims 9-15, wherein the determining comprises determining a plurality of candidate toggle points and the tracking comprises tracking a corresponding number of times a respective characteristic of multiple previous decodes of the instruction stream is present for each respective location.
17. An apparatus comprising:
a memory to store instructions;
a first decode cluster comprising a plurality of decoder circuits;
a second decode cluster comprising a plurality of decoder circuits; and
a toggle point control circuit to toggle between sending the instructions requested for decoding between the first decode cluster and the second decode cluster, wherein the toggle point control circuit is to:
determine a location in an instruction stream as a candidate toggle point to switch the sending of the instructions requested for decoding between the first decode cluster and the second decode cluster,
track a number of times a characteristic of multiple previous decodes of the instruction stream is present for the location, and
cause insertion of a toggle point at the location, based on the number of times, to switch the sending of the instructions requested for decoding between the first decode cluster and the second decode cluster.
18. The apparatus of claim 17, wherein the characteristic is a number of micro-operations decoded from the instruction stream before the location and after an immediately previous switch of decoding of the instruction stream between the first decode cluster and the second decode cluster.
19. The apparatus of claim 17, wherein the characteristic is a number of macro-instructions decoded from the instruction stream before the location and after an immediately previous switch of decoding of the instruction stream between the first decode cluster and the second decode cluster.
20. The apparatus of claim 17, wherein the insertion of the toggle point comprises insertion of a branch instruction in a branch target buffer.
21. The apparatus of claim 17, wherein the toggle point control circuit is to further remove the location as the candidate toggle point when an existing toggle point is encountered within a threshold number of instructions after the location in a subsequent decode of the instruction stream.
22. The apparatus of claim 17, wherein the characteristic is a number of micro-operations decoded from the instruction stream before the location.
23. The apparatus of claim 17, wherein the toggle point control circuit comprises a timer and is to stop tracking the number of times the characteristic of multiple previous decodes of the instruction stream is present for the location after a tracking time from the timer exceeds a threshold time.
24. The apparatus of any one of claims 17-23, wherein the toggle point control circuit is to determine a plurality of candidate toggle points and track a corresponding number of times a respective characteristic of multiple previous decodes of the instruction stream is present for each respective location.
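
Illustrative sketch (not part of the claims): the tracking behavior recited above can be modeled in software to make the control flow concrete. The Python model below is a simplified sketch under stated assumptions, not the claimed circuitry. The class and field names, the threshold values, the use of decode passes as the timer unit, and the use of a plain set in place of branch target buffer entries are all hypothetical choices made for illustration. It counts micro-operations since the previous switch (claims 2, 10, 18), counts how often a location satisfies that characteristic across decodes (claims 1, 9, 17), drops a candidate when an existing toggle point follows closely (claims 5, 13, 21), and stops tracking stale candidates (claims 7, 15, 23).

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    hits: int        # decode passes in which the characteristic held here
    first_pass: int  # pass index at which tracking of this location began


class TogglePointTracker:
    """Software model of candidate toggle point tracking (all names hypothetical)."""

    def __init__(self, uop_threshold=8, hit_threshold=3, max_age=16, min_gap=2):
        self.uop_threshold = uop_threshold  # uops per block before a switch is wanted
        self.hit_threshold = hit_threshold  # hits needed before inserting a toggle
        self.max_age = max_age              # passes a candidate may be tracked
        self.min_gap = min_gap              # instructions to an existing toggle
        self.candidates = {}                # address -> Candidate
        self.toggle_points = set()          # stand-in for toggle entries in a BTB
        self.passes = 0

    def observe_decode(self, stream):
        """Process one decode pass; stream is [(address, uop_count), ...]."""
        self.passes += 1
        # Timer: stop tracking candidates that have been tracked for too long.
        self.candidates = {a: c for a, c in self.candidates.items()
                           if self.passes - c.first_pass <= self.max_age}
        uops = 0          # uops decoded since the previous switch
        last_cand = None  # most recent candidate seen in this pass
        since_cand = 0    # instructions decoded since last_cand
        for addr, n_uops in stream:
            if addr in self.toggle_points:
                # An existing toggle closely follows a candidate: drop the candidate.
                if last_cand is not None and since_cand <= self.min_gap:
                    self.candidates.pop(last_cand, None)
                uops, last_cand = 0, None
            elif uops >= self.uop_threshold:
                # Characteristic present: enough uops decoded since the last switch.
                cand = self.candidates.setdefault(addr, Candidate(0, self.passes))
                cand.hits += 1
                if cand.hits >= self.hit_threshold:
                    self.toggle_points.add(addr)  # insert the toggle point
                    del self.candidates[addr]
                    last_cand = None
                else:
                    last_cand, since_cand = addr, 0
                uops = 0
            uops += n_uops
            since_cand += 1


if __name__ == "__main__":
    tracker = TogglePointTracker(uop_threshold=8, hit_threshold=3)
    # Six instructions, 4 bytes apart, each decoding to 4 uops.
    stream = [(0x100 + 4 * i, 4) for i in range(6)]
    for _ in range(3):
        tracker.observe_decode(stream)
    print(sorted(hex(a) for a in tracker.toggle_points))  # ['0x108', '0x110']
```

In this model a location must satisfy the characteristic on several decodes before a toggle point is inserted, so a one-off path through the code does not change how the stream is split between clusters. Resetting the micro-operation count at each candidate is a modeling shortcut; the claims do not specify it.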
| # | Name | Date |
|---|---|---|
| 1 | 202244048156-FORM 1 [24-08-2022(online)].pdf | 2022-08-24 |
| 2 | 202244048156-DRAWINGS [24-08-2022(online)].pdf | 2022-08-24 |
| 3 | 202244048156-DECLARATION OF INVENTORSHIP (FORM 5) [24-08-2022(online)].pdf | 2022-08-24 |
| 4 | 202244048156-COMPLETE SPECIFICATION [24-08-2022(online)].pdf | 2022-08-24 |
| 5 | 202244048156-FORM-26 [24-11-2022(online)].pdf | 2022-11-24 |
| 6 | 202244048156-FORM 3 [24-02-2023(online)].pdf | 2023-02-24 |
| 7 | 202244048156-FORM 3 [24-08-2023(online)].pdf | 2023-08-24 |
| 8 | 202244048156-Proof of Right [10-10-2023(online)].pdf | 2023-10-10 |
| 9 | 202244048156-FORM 3 [23-02-2024(online)].pdf | 2024-02-23 |
| 10 | 202244048156-FORM 18 [17-09-2025(online)].pdf | 2025-09-17 |