What do node distances in numactl mean?


I'm trying to understand what the node distances in numactl --hardware mean.

On our cluster, it outputs the following:

numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
node 0 size: 32143 MB
node 0 free: 188 MB
node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
node 1 size: 32254 MB
node 1 free: 69 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10

This is what I understood so far:

  • We have 24 virtual CPUs, and each node has about 32 GB of DRAM.
  • On a NUMA system, a CPU has to make a "hop" to the other node to access that node's memory, and this incurs a higher latency.
  • In this context, do the numbers 10 and 21 indicate the latencies for "hops"? How do I find the latency in ns? Is that specified somewhere?

This and this didn't help me much.

EDIT: This link says that the distances are not in ns but are relative distances. How do I get the absolute latency in ns?

Any help will be appreciated.

There are 2 best solutions below


To get absolute latency numbers on an Intel system, you can run Intel's Memory Latency Checker (mlc) tool on the specific machine you care about. https://software.intel.com/en-us/articles/intel-memory-latency-checker

It prefers root/admin privileges so that it can disable hardware prefetching, which otherwise skews the numbers. If you don't have those, the docs also point out that you can ask it to access random elements on the other nodes, which gets very close to the true numbers, e.g.:

./mlc --latency_matrix -e -l128 -r
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --latency_matrix -e -l128 -r

Using buffer size of 200.000MB
Measuring idle latencies (in ns)...
                Numa node
Numa node            0       1
       0         112.5   180.3
       1         180.8   112.4
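
To relate a latency matrix like this to the node distances from numactl --hardware, one option is to normalize both to their local (diagonal) value and compare the ratios. The sketch below uses the numbers quoted in this thread; the helper name relative_to_local is made up for illustration, and the mlc run above is not necessarily from the same machine as the question's numactl output, so treat the comparison as an illustration only.

```python
def relative_to_local(matrix):
    """Divide each row by its diagonal entry, so local access becomes 1.0."""
    return [[value / row[i] for value in row] for i, row in enumerate(matrix)]


# Node distances as printed by numactl --hardware in the question.
distances = [[10, 21], [21, 10]]

# Idle latencies in ns from the mlc latency matrix above.
latencies_ns = [[112.5, 180.3], [180.8, 112.4]]

distance_ratio = relative_to_local(distances)
latency_ratio = relative_to_local(latencies_ns)

for node in range(len(distances)):
    print(
        "node", node,
        "distance ratio:", [round(r, 2) for r in distance_ratio[node]],
        "latency ratio:", [round(r, 2) for r in latency_ratio[node]],
    )
```

With these inputs the distance table suggests remote access costs about 2.1 times local access, while the measured remote latency is about 1.6 times the local one. The distances are relative hints, not a substitute for measuring.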

numactl --hardware gives you stats about the architecture of your hardware, not about its performance.
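
If you want those architectural numbers in a script, the distance table can be pulled out of the text output. This is a minimal sketch that assumes the output layout shown in the question; parse_node_distances is a hypothetical helper, not part of numactl.

```python
def parse_node_distances(text):
    """Return {node_id: [distance, ...]} from `numactl --hardware` output.

    Assumes the layout shown in the question: a "node distances:" line,
    a header row, then one "N: d0 d1 ..." row per node.
    """
    distances = {}
    in_table = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("node distances:"):
            in_table = True
            continue
        if not in_table or ":" not in stripped:
            continue  # also skips the "node   0   1" header row
        label, _, values = stripped.partition(":")
        if label.strip().isdigit():
            distances[int(label)] = [int(v) for v in values.split()]
    return distances


# Sample input copied from the question's output (abridged).
sample = """\
available: 2 nodes (0-1)
node 0 size: 32143 MB
node 1 size: 32254 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10
"""

for node, row in parse_node_distances(sample).items():
    print("node", node, "distances:", row)
```

On Linux the same rows are also exposed per node in /sys/devices/system/node/node*/distance.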

If you want the performance characteristics of your hardware, you will have to measure them yourself, either by finding an existing benchmark online or by writing your own. https://stackoverflow.com/a/47815885/1411628 will give you an idea of how to get started on writing your own benchmark.