How to know which RDMA device/port/gid to use?


I have two hosts that are connected through RDMA (one is a SmartNIC, the other is the server). How can I know which pair of device/port/gid to use, if for example I want to run ib_send_bw -d <device> -i <port> -x <gid> between them?

For example, here is the output of show_gids for my two hosts:

Host A:

DEV     PORT    INDEX   GID                                     IPv4            VER     DEV
---     ----    -----   ---                                     ------------    ---     ---
mlx5_0  1       0       fe80:0000:0000:0000:4ab0:2dff:fe8f:add0                 v1      ens2f0np0
mlx5_0  1       1       fe80:0000:0000:0000:4ab0:2dff:fe8f:add0                 v2      ens2f0np0
mlx5_0  1       2       0000:0000:0000:0000:0000:ffff:c0a8:0001 192.168.0.1     v1      ens2f0np0
mlx5_0  1       3       0000:0000:0000:0000:0000:ffff:c0a8:0001 192.168.0.1     v2      ens2f0np0
mlx5_1  1       0       fe80:0000:0000:0000:4ab0:2dff:fe8f:add1                 v1      ens2f1np1
mlx5_1  1       1       fe80:0000:0000:0000:4ab0:2dff:fe8f:add1                 v2      ens2f1np1
mlx5_1  1       2       0000:0000:0000:0000:0000:ffff:c0a8:0101 192.168.1.1     v1      ens2f1np1
mlx5_1  1       3       0000:0000:0000:0000:0000:ffff:c0a8:0101 192.168.1.1     v2      ens2f1np1
mlx5_2  1       0       fe80:0000:0000:0000:966d:aeff:fe9b:95e0                 v1      ens4f0np0
mlx5_2  1       1       fe80:0000:0000:0000:966d:aeff:fe9b:95e0                 v2      ens4f0np0
mlx5_3  1       0       fe80:0000:0000:0000:966d:aeff:fe9b:95e1                 v1      ens4f1np1
mlx5_3  1       1       fe80:0000:0000:0000:966d:aeff:fe9b:95e1                 v2      ens4f1np1
mlx5_4  1       0       fe80:0000:0000:0000:1270:fdff:fe86:5c7e                 v1      ens7f0np0
mlx5_4  1       1       fe80:0000:0000:0000:1270:fdff:fe86:5c7e                 v2      ens7f0np0
mlx5_5  1       0       fe80:0000:0000:0000:1270:fdff:fe86:5c7f                 v1      ens7f1np1
mlx5_5  1       1       fe80:0000:0000:0000:1270:fdff:fe86:5c7f                 v2      ens7f1np1
n_gids_found=16

Host B:

DEV     PORT    INDEX   GID                                     IPv4            VER     DEV
---     ----    -----   ---                                     ------------    ---     ---
mlx5_2  1       0       fe80:0000:0000:0000:0040:abff:febb:0745                 v2      enp3s0f0s0
mlx5_3  1       0       fe80:0000:0000:0000:0017:2eff:fef9:f1f6                 v2      enp3s0f1s0
n_gids_found=2

user3621602 On

A good way to start is by process of elimination.

As a rule of thumb, if two interfaces cannot ping each other (assuming their IP addresses are configured correctly), they are unlikely to be able to talk to each other over RDMA.

So the first question is: which interfaces can reach the remote host? You can eliminate the entries that belong to PFs (physical functions) that are not actually connected to the desired network, i.e. those that have no cable inserted, are down, or are connected to a different switch.
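As a rough illustration of the "is the port even up?" check, here is a minimal Python sketch that reads the link state the kernel exposes under `/sys/class/infiniband/<device>/ports/<port>/state`. The function name `active_ports` and its `sysfs_root` parameter are my own, not part of any tool, and I am assuming the state file contains text such as `4: ACTIVE`, which is what I would expect on a typical Linux RDMA stack.

```python
from pathlib import Path


def active_ports(sysfs_root="/sys/class/infiniband"):
    """Return (device, port) pairs whose link state is reported as ACTIVE.

    Hypothetical helper: sysfs_root is a parameter only so the function
    can be pointed at a different directory.
    """
    result = []
    root = Path(sysfs_root)
    if not root.is_dir():
        return result  # no RDMA devices visible on this machine
    for dev in sorted(root.iterdir()):
        ports_dir = dev / "ports"
        if not ports_dir.is_dir():
            continue
        for port in sorted(ports_dir.iterdir()):
            state_file = port / "state"
            if not state_file.is_file():
                continue
            # Assumed file content: "<number>: <STATE>", e.g. "4: ACTIVE".
            state = state_file.read_text().strip()
            if state.endswith("ACTIVE"):
                result.append((dev.name, int(port.name)))
    return result


if __name__ == "__main__":
    for dev, port in active_ports():
        print(f"{dev} port {port} is ACTIVE")
```

An ACTIVE port only tells you the link is up. It does not prove the port is on the same network as the remote side, so the ping test is still worth doing.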

Since Host B has only two GIDs, both of version v2 (RoCE v2), you can safely drop all the v1 entries on Host A.
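If you want to do that filtering mechanically, something like the following Python sketch could work. It assumes the whitespace-separated column layout shown in the question (`DEV PORT INDEX GID IPv4 VER DEV`, with the IPv4 column sometimes empty); the helper names `parse_show_gids` and `rocev2_only` are made up for this example.

```python
SAMPLE = """\
DEV     PORT    INDEX   GID                                     IPv4            VER     DEV
---     ----    -----   ---                                     ------------    ---     ---
mlx5_0  1       0       fe80:0000:0000:0000:4ab0:2dff:fe8f:add0                 v1      ens2f0np0
mlx5_0  1       1       fe80:0000:0000:0000:4ab0:2dff:fe8f:add0                 v2      ens2f0np0
mlx5_0  1       2       0000:0000:0000:0000:0000:ffff:c0a8:0001 192.168.0.1     v1      ens2f0np0
mlx5_0  1       3       0000:0000:0000:0000:0000:ffff:c0a8:0001 192.168.0.1     v2      ens2f0np0
n_gids_found=4
"""


def parse_show_gids(text):
    """Parse show_gids-style output into a list of dicts (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 6 or 7 fields with a numeric port and GID index;
        # this skips the header, the dashes and the n_gids_found line.
        if len(fields) not in (6, 7):
            continue
        if not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        rows.append({
            "dev": fields[0],
            "port": int(fields[1]),
            "index": int(fields[2]),
            "gid": fields[3],
            # The IPv4 column is empty for link-local GIDs.
            "ipv4": fields[4] if len(fields) == 7 else None,
            "ver": fields[-2],
            "netdev": fields[-1],
        })
    return rows


def rocev2_only(rows):
    """Keep only the entries whose VER column is v2."""
    return [r for r in rows if r["ver"] == "v2"]


if __name__ == "__main__":
    for r in rocev2_only(parse_show_gids(SAMPLE)):
        print(r["dev"], r["port"], r["index"], r["gid"], r["ipv4"])
```

The `dev`, `port` and `index` fields of whatever survives are the candidates for `-d`, `-i` and `-x`.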

Notice that the GIDs starting with fe80 refer to link-local IPv6 addresses, while the others are tied to a specific IP address configured on the interface. The link-local GIDs are fine to use as long as you are connecting within the same logical local network.
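To make the distinction between the two kinds of GID concrete, here is a small sketch using Python's standard `ipaddress` module. The function name `describe_gid` is my own; the GID strings are taken from the Host A output in the question.

```python
import ipaddress


def describe_gid(gid):
    """Classify a GID string as link-local, IPv4-mapped, or other."""
    addr = ipaddress.IPv6Address(gid)
    if addr.is_link_local:
        # fe80::/10, derived from the interface's MAC address
        return "link-local"
    if addr.ipv4_mapped is not None:
        # ::ffff:a.b.c.d, derived from an IPv4 address set on the interface
        return f"ipv4-mapped {addr.ipv4_mapped}"
    return "other"


if __name__ == "__main__":
    for gid in (
        "fe80:0000:0000:0000:4ab0:2dff:fe8f:add0",
        "0000:0000:0000:0000:0000:ffff:c0a8:0001",
    ):
        print(gid, "->", describe_gid(gid))
```

In the question's output, Host B only has link-local GIDs, so the matching link-local v2 entry on the connected Host A port is the natural candidate, provided both ends really are on the same local network.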