Abstract: One embodiment provides a graphics processor including a processing resource including a register file, memory, a cache memory, and load/store/cache circuitry to process load, store, and prefetch messages from the processing resource. The circuitry includes support for an immediate address offset that will be used to adjust the address supplied for a memory access to be requested by the circuitry. Including support for the immediate address offset removes the need to execute additional instructions to adjust the address to be accessed prior to execution of the memory access instruction.
Description: [0001] The present application claims priority to U.S. Non-Provisional Patent Application No. 17/480,528, filed on 21 September 2021 and titled “IMMEDIATE OFFSET OF LOAD STORE AND ATOMIC INSTRUCTIONS”, the entire disclosure of which is hereby incorporated by reference.
FIELD
[0002] This disclosure relates generally to data processing and more particularly to data processing via a general-purpose graphics processing unit.
BACKGROUND OF THE DISCLOSURE
[0003] The operands for arithmetic and logic operations performed by a graphics processor are contained in registers. To operate on data in main memory, the data is first copied into registers. A load operation copies data from main memory into a register. A store operation copies data from a register into main memory. Modern 3D game applications can make use of structured buffers to store information used to render a scene. For example, structured buffers can be used to store lighting, material, and related information for compute shaders, or vertex attributes for vertex shaders. The use of a global offset to access structured buffer members has become common in 3D game applications. To load structured buffer member data into a register, an add operation is first performed to add the global offset to the structure address, and the load is then issued with the resulting address.
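The cost of that extra add can be illustrated with a small Python model. This is a hedged sketch under stated assumptions, not code from the disclosure: `memory`, `regs`, `MEMBER_OFFSET`, and the helpers `add`, `load`, and `load_imm` are hypothetical names, and memory is modeled as a flat list of words.

```python
# Hypothetical model: memory is a flat list of words, registers are a dict.
memory = list(range(100, 164))   # 64 words; word i holds the value 100 + i
regs = {"base": 8}               # address of a structured buffer element

def add(dst, src, imm):
    """Arithmetic instruction: regs[dst] = regs[src] + imm."""
    regs[dst] = regs[src] + imm

def load(dst, addr_reg):
    """Load without an immediate offset: regs[dst] = memory[regs[addr_reg]]."""
    regs[dst] = memory[regs[addr_reg]]

def load_imm(dst, addr_reg, imm):
    """Load with the immediate offset folded into the memory access."""
    regs[dst] = memory[regs[addr_reg] + imm]

MEMBER_OFFSET = 3  # assumed offset of the member within the structure

# Without immediate offset support: an add, then the load.
add("tmp", "base", MEMBER_OFFSET)
load("r0", "tmp")

# With immediate offset support: one instruction and no temporary register.
load_imm("r1", "base", MEMBER_OFFSET)
```

Both sequences read the same word; the second does so without the separate add instruction or the temporary register that holds the adjusted address.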
Claims: 1. A graphics processor comprising:
a processing resource including a register file;
a memory device;
a cache coupled with the processing resource and the memory device; and
circuitry to process memory access messages received from the processing resource, wherein to process the memory access messages, the circuitry is configured to:
receive a memory access message from the processing resource, wherein the memory access message includes an address and an immediate offset value;
perform a bounds check for the memory address as adjusted according to the immediate offset value, wherein the bounds check is performed for a memory allocation to be accessed via the memory access message;
generate one or more memory access requests including the memory address as adjusted according to the immediate offset value in response to determination that the memory access is an in-bound memory access; and
submit the one or more memory access requests to a cache or memory interface.
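The sequence recited in claim 1 (receive, adjust, bounds check, generate, submit) can be sketched in Python as follows. This is only an illustrative model under assumptions: the `size` field, the `Allocation` type, the 64-byte `LINE_BYTES` granularity, and the splitting of an access at line boundaries are hypothetical and are not stated in the claims. Out-of-bounds handling is likewise left unmodeled.

```python
from dataclasses import dataclass

@dataclass
class MemoryAccessMessage:
    address: int      # address supplied by the processing resource
    imm_offset: int   # immediate offset carried in the message
    size: int         # bytes to access (assumed field)

@dataclass
class Allocation:
    base: int         # first byte of the memory allocation
    length: int       # size of the allocation in bytes

LINE_BYTES = 64  # assumed request granularity

def process_message(msg, alloc, submit):
    """Adjust the address, bounds check it, then generate and submit requests."""
    adjusted = msg.address + msg.imm_offset
    in_bounds = (alloc.base <= adjusted
                 and adjusted + msg.size <= alloc.base + alloc.length)
    if not in_bounds:
        return False  # out-of-bounds behavior is not modeled in this sketch
    # Split the access at line boundaries (an assumption of this sketch).
    addr, remaining = adjusted, msg.size
    while remaining > 0:
        chunk = min(remaining, LINE_BYTES - addr % LINE_BYTES)
        submit((addr, chunk))  # hand the request to a cache or memory interface
        addr += chunk
        remaining -= chunk
    return True

submitted = []
alloc = Allocation(base=0x1000, length=256)
ok = process_message(MemoryAccessMessage(0x1000, 48, 32), alloc, submitted.append)
bad = process_message(MemoryAccessMessage(0x1000, 240, 32), alloc, submitted.append)
```

Here the first message is in bounds and yields two requests starting at the adjusted address `0x1030`; the second would run past the end of the allocation, so no request is generated for it.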
2. The graphics processor as in claim 1, wherein the processing resource is configured to:
receive a memory access instruction that includes the address and the immediate offset value; and
transmit the memory access message to the circuitry, the memory access message including the address and the immediate offset value.
3. The graphics processor as in claim 1 or 2, wherein to generate the one or more memory access requests including the memory address as adjusted according to the immediate offset value includes to:
determine a set of active parallel processing lanes of the processing resource that are associated with the memory access message;
compute a per-lane offset for a parallel processing lane in the set of active parallel processing lanes, the per-lane offset to indicate a data element in a set of packed data elements; and
add the immediate offset value to the per-lane offset to generate an adjusted per-lane offset.
4. The graphics processor as in claim 3, wherein the circuitry to process the memory access messages includes an adder circuit associated with each of multiple parallel processing lanes of the processing resource and circuitry to store the immediate offset value.
5. The graphics processor as in claim 4, wherein to perform the bounds check for the memory address includes to perform a bounds check for the adjusted per-lane offset.
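The per-lane computation of claims 3 to 5 might look like the sketch below. It assumes, purely for illustration, that the per-lane offset is the lane index multiplied by the element size and that the execution mask is a list of booleans; the function name `adjusted_lane_offsets` and its parameters are hypothetical.

```python
def adjusted_lane_offsets(exec_mask, element_bytes, imm_offset, alloc_bytes):
    """Return {lane: adjusted offset} for active lanes that pass the bounds check."""
    result = {}
    for lane, active in enumerate(exec_mask):
        if not active:
            continue  # inactive lanes generate no memory access
        per_lane = lane * element_bytes    # locates the packed data element (assumed layout)
        adjusted = per_lane + imm_offset   # models the per-lane adder
        # Bounds check on the adjusted per-lane offset.
        if adjusted >= 0 and adjusted + element_bytes <= alloc_bytes:
            result[lane] = adjusted
    return result

mask = [True, True, False, True, False, False, False, True]
offsets = adjusted_lane_offsets(mask, element_bytes=4, imm_offset=16, alloc_bytes=40)
```

With these inputs, lanes 0, 1, and 3 are in bounds, while lane 7 is active but its adjusted offset of 44 falls outside the 40-byte allocation.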
6. The graphics processor as in any one of claims 1-5, wherein the memory access message indicates to transfer data between the register file and the memory device or between the memory device and the cache memory.
7. The graphics processor as in any one of claims 1-6, wherein the circuitry is configured to decode the memory access message received from the processing resource to determine a memory access operation to perform in response to the memory access message.
8. The graphics processor as in claim 7, wherein the memory access operation is a load operation to transfer data from the memory device to the register file, a store operation to transfer data from the register file to the memory device, or an atomic operation to perform an atomic read-modify-write operation on data on the memory device.
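The decode step of claims 7 and 8 can be modeled as a dispatch on the operation type, with the immediate offset applied the same way for each operation. This is a hedged sketch: the `Op` enumeration, the choice of an atomic add as the read-modify-write operation, and the convention of returning the previous value are all assumptions, and real hardware atomicity is not modeled.

```python
from enum import Enum

class Op(Enum):
    LOAD = 0
    STORE = 1
    ATOMIC_ADD = 2  # one possible read-modify-write operation (assumed)

def execute(op, memory, regs, reg, address, imm_offset):
    """Decode the operation and apply it at address + imm_offset."""
    target = address + imm_offset
    if op is Op.LOAD:
        regs[reg] = memory[target]          # memory -> register file
    elif op is Op.STORE:
        memory[target] = regs[reg]          # register file -> memory
    elif op is Op.ATOMIC_ADD:
        old = memory[target]
        memory[target] = old + regs[reg]    # read-modify-write
        regs[reg] = old                     # return the previous value (assumed convention)
    else:
        raise ValueError(f"unsupported operation: {op}")

memory = [0] * 8
regs = {"r0": 5}
execute(Op.STORE, memory, regs, "r0", 2, 1)       # writes 5 to memory[3]
execute(Op.ATOMIC_ADD, memory, regs, "r0", 2, 1)  # memory[3] becomes 10
execute(Op.LOAD, memory, regs, "r1", 2, 1)        # reads memory[3] into r1
```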
9. The graphics processor as in any one of claims 1-8, wherein the memory allocation to be accessed via the memory access message is a surface including pixel data associated with a graphics operation performed by the processing resource or a surface including general-purpose compute data associated with a compute operation performed by the processing resource.
10. The graphics processor as in claim 9, wherein the general-purpose compute data includes matrix data associated with a matrix operation performed by the processing resource and the processing resource includes matrix operation acceleration circuitry to perform a matrix operation on the general-purpose compute data.
11. A method comprising:
receiving a memory access message at circuitry configured to facilitate access to memory of a graphics processing device, the message received from a processing resource of the graphics processing device, wherein the memory access message includes an address and an immediate offset value;
performing a bounds check for the memory address as adjusted according to the immediate offset value, wherein the bounds check is performed for a memory allocation to be accessed via the memory access message;
generating one or more memory access requests including the memory address as adjusted according to the immediate offset value in response to determination that the memory access is an in-bound memory access; and
submitting the one or more memory access requests to a cache or memory interface.
12. The method as in claim 11, further comprising:
receiving a memory access instruction that includes the address and the immediate offset value; and
transmitting the memory access message to the circuitry, the memory access message including the address and the immediate offset value.
13. The method as in claim 11 or 12, wherein generating the one or more memory access requests including the memory address as adjusted according to the immediate offset value includes:
determining a set of active parallel processing lanes of the processing resource that are associated with the memory access message;
computing a per-lane offset for a parallel processing lane in the set of active parallel processing lanes, the per-lane offset to indicate a data element in a set of packed data elements;
adding the immediate offset value to the per-lane offset to generate an adjusted per-lane offset; and
performing the bounds check on the adjusted per-lane offset.
14. The method as in claim 13, wherein the circuitry to process the memory access messages includes an adder circuit associated with each of multiple parallel processing lanes of the processing resource and circuitry to store the immediate offset value.
15. A data processing system comprising means to perform a method as in any one of claims 11-14.
| # | Name | Date |
|---|---|---|
| 1 | 202244047474-FORM 1 [20-08-2022(online)].pdf | 2022-08-20 |
| 2 | 202244047474-DRAWINGS [20-08-2022(online)].pdf | 2022-08-20 |
| 3 | 202244047474-DECLARATION OF INVENTORSHIP (FORM 5) [20-08-2022(online)].pdf | 2022-08-20 |
| 4 | 202244047474-COMPLETE SPECIFICATION [20-08-2022(online)].pdf | 2022-08-20 |
| 5 | 202244047474-FORM-26 [18-11-2022(online)].pdf | 2022-11-18 |
| 6 | 202244047474-FORM 3 [20-02-2023(online)].pdf | 2023-02-20 |
| 7 | 202244047474-Proof of Right [05-09-2023(online)].pdf | 2023-09-05 |
| 8 | 202244047474-FORM 18 [15-09-2025(online)].pdf | 2025-09-15 |