Will the 16n prefetch in DDR5 affect the bandwidth of small-granularity memory accesses?


In the design of DDR5, the prefetch length has increased from 8n to 16n. On a traditional 64-bit (8-byte) wide memory interface, this means the memory chips internally fetch 16 * 8 = 128 bytes of contiguous data per access, which exceeds the size of a 64-byte cache line. I therefore suspect that if each access touches 64 bytes or less and the addresses are random, half of the prefetched data is wasted, so the effective bandwidth of the memory module should be halved (or at least noticeably reduced).

So, I conducted experiments over a contiguous 16GB address space, using the pointer-chasing method for random access. In Experiment 1, I accessed the first 64 bytes of each 128-byte unit; in Experiment 2, I accessed the first 128 bytes of each 256-byte unit. The accessed data was aligned to 128-byte and 256-byte boundaries, respectively. The experiments were run with 1GB huge pages enabled, which removes the influence of the TLB. In addition, the footprint was the same in both experiments, which eliminates any effect from the cache hierarchy. The results showed the same observed memory bandwidth in both experiments.
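For reference, here is a minimal sketch of the kind of pointer-chasing loop I mean, assuming Linux with pre-reserved 1GB huge pages (`MAP_HUGETLB | MAP_HUGE_1GB`). It only loads the pointer at the start of each unit and measures dependent-load latency; the actual experiments additionally read the full 64- or 128-byte block at each node and used enough concurrency to measure bandwidth rather than latency. The names `REGION_SIZE`, `STRIDE`, and `ITERS` are just illustrative parameters.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/mman.h>

#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26
#endif
#ifndef MAP_HUGE_1GB
#define MAP_HUGE_1GB (30 << MAP_HUGE_SHIFT)
#endif

#define REGION_SIZE (16ULL << 30)   /* 16 GB working set                      */
#define STRIDE      128             /* 128 for Experiment 1, 256 for Exp. 2   */
#define ITERS       (1ULL << 26)    /* number of dependent loads to time      */

int main(void) {
    size_t nodes = REGION_SIZE / STRIDE;

    /* Requires 1 GB huge pages to be reserved beforehand (hugetlbfs pool). */
    char *buf = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_1GB,
                     -1, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    /* Build a random permutation of the unit indices (Fisher-Yates). */
    size_t *perm = malloc(nodes * sizeof *perm);
    if (!perm) { perror("malloc"); return 1; }
    for (size_t i = 0; i < nodes; i++) perm[i] = i;
    for (size_t i = nodes - 1; i > 0; i--) {
        size_t j = (size_t)(drand48() * (double)(i + 1));
        size_t t = perm[i]; perm[i] = perm[j]; perm[j] = t;
    }

    /* Link the units into one cycle: the first 8 bytes of each unit hold
       the address of the next unit to visit. */
    for (size_t i = 0; i < nodes; i++) {
        void **slot = (void **)(buf + perm[i] * (size_t)STRIDE);
        *slot = buf + perm[(i + 1) % nodes] * (size_t)STRIDE;
    }
    free(perm);

    /* Chase the chain: every load depends on the previous one, so each
       access goes to a random STRIDE-aligned address in the 16 GB region. */
    void **p = (void **)buf;
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (uint64_t i = 0; i < ITERS; i++)
        p = (void **)*p;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double sec = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
    /* Print the final pointer so the chase cannot be optimized away. */
    printf("avg latency: %.1f ns (final ptr %p)\n", sec / ITERS * 1e9, (void *)p);
    return 0;
}
```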

I'm using an AMD EPYC 9654 CPU with M321RYGA0BB0-CQKZJ memory modules (DDR5-4800).

Is my understanding of the prefetch behavior in DDR5 incorrect?
