Workload Memory Bandwidth Comparison Inconsistency


I have an Intel(R) Core(TM) i7-4720HQ CPU @ 2.60GHz (Haswell) processor. On a relatively idle system, I ran the following perf command for around 5 seconds against each workload. The counters are offcore_response.all_data_rd.l3_miss.local_dram and offcore_response.all_code_rd.l3_miss.local_dram:

sudo perf stat -e offcore_response.all_data_rd.l3_miss.local_dram,offcore_response.all_code_rd.l3_miss.local_dram -p <PID>

The workloads are: 1) playing a video in VLC and 2) running the KDevelop indexer on a large code base. The outputs are shown below:

VLC:

    Performance counter stats for process id '14617':

         1,621,980      offcore_response.all_data_rd.l3_miss.local_dram                                   
         1,611,825      offcore_response.all_code_rd.l3_miss.local_dram                                   

       4.993841802 seconds time elapsed

KDevelop:

    Performance counter stats for process id '23294':

        31,006,390      offcore_response.all_data_rd.l3_miss.local_dram                                   
        10,236,222      offcore_response.all_code_rd.l3_miss.local_dram                                   

       5.095681532 seconds time elapsed

Based on these statistics (summing both counters for each process), KDevelop reads from DRAM more than 12 times as often as VLC.
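To make that comparison explicit, here is a minimal Python sketch of the arithmetic. It only uses the counter values and elapsed times from the perf output above; the variable and function names are my own, and summing the two counters is an assumption about how the "12 times" figure is obtained.

```python
# Counter values and elapsed times copied from the perf stat output above.
vlc = {"data_rd": 1_621_980, "code_rd": 1_611_825, "seconds": 4.993841802}
kdevelop = {"data_rd": 31_006_390, "code_rd": 10_236_222, "seconds": 5.095681532}


def misses_per_second(sample):
    """Sum both L3-miss counters and normalize by the elapsed time."""
    return (sample["data_rd"] + sample["code_rd"]) / sample["seconds"]


vlc_rate = misses_per_second(vlc)
kdevelop_rate = misses_per_second(kdevelop)
perf_ratio = kdevelop_rate / vlc_rate

print(f"VLC:      {vlc_rate:,.0f} L3-miss reads/s")
print(f"KDevelop: {kdevelop_rate:,.0f} L3-miss reads/s")
print(f"KDevelop / VLC: {perf_ratio:.1f}x")
```

Normalizing by elapsed time barely changes the result here, since both runs lasted about 5 seconds.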

However, the IMC counter statistics (retrieved using PCM) are at odds with the performance counters above. On the idle system, the total system bandwidth is around 2.65GB (READ: 2.30GB, WRITE: 0.35GB). The total system bandwidth for each workload (run separately) is as follows:

VLC:

around 8.40GB (READ: 4.65GB, WRITE: 3.75GB)

KDevelop:

around 3.75GB (READ: 3.15GB, WRITE: 0.60GB)

After subtracting the idle system bandwidth, the VLC and KDevelop bandwidths are around 5.75GB and 1.10GB, respectively. This time, VLC's memory traffic is more than 5 times that of KDevelop, which clearly conflicts with the perf results.
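The subtraction can be written out as a short Python sketch. It uses only the PCM totals quoted above; the names are my own, and it assumes the idle traffic stays constant while each workload runs, which may not hold exactly.

```python
# Total system bandwidth figures (GB) as reported by PCM above.
idle_total = 2.65
vlc_total = 8.40
kdevelop_total = 3.75

# Assumption: background traffic is the same with and without the workload.
vlc_net = vlc_total - idle_total
kdevelop_net = kdevelop_total - idle_total
imc_ratio = vlc_net / kdevelop_net

print(f"VLC net:      {vlc_net:.2f} GB")
print(f"KDevelop net: {kdevelop_net:.2f} GB")
print(f"VLC / KDevelop: {imc_ratio:.1f}x")
```

Note that the PCM totals include writes and traffic from every agent on the system, while the perf counters above cover only demand reads from the one monitored PID.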

How can these two results be reconciled?
