I have two hosts that are connected through RDMA (one is a SmartNIC, the other is the server). How can I know which pair of device/port/gid to use, if for example I want to run ib_send_bw -d <device> -i <port> -x <gid> between them?
For example, here is the output of show_gids for my two hosts:
Host A:
DEV     PORT  INDEX  GID                                      IPv4          VER  DEV
---     ----  -----  ---                                      ------------  ---  ---
mlx5_0  1     0      fe80:0000:0000:0000:4ab0:2dff:fe8f:add0                v1   ens2f0np0
mlx5_0  1     1      fe80:0000:0000:0000:4ab0:2dff:fe8f:add0                v2   ens2f0np0
mlx5_0  1     2      0000:0000:0000:0000:0000:ffff:c0a8:0001  192.168.0.1   v1   ens2f0np0
mlx5_0  1     3      0000:0000:0000:0000:0000:ffff:c0a8:0001  192.168.0.1   v2   ens2f0np0
mlx5_1  1     0      fe80:0000:0000:0000:4ab0:2dff:fe8f:add1                v1   ens2f1np1
mlx5_1  1     1      fe80:0000:0000:0000:4ab0:2dff:fe8f:add1                v2   ens2f1np1
mlx5_1  1     2      0000:0000:0000:0000:0000:ffff:c0a8:0101  192.168.1.1   v1   ens2f1np1
mlx5_1  1     3      0000:0000:0000:0000:0000:ffff:c0a8:0101  192.168.1.1   v2   ens2f1np1
mlx5_2  1     0      fe80:0000:0000:0000:966d:aeff:fe9b:95e0                v1   ens4f0np0
mlx5_2  1     1      fe80:0000:0000:0000:966d:aeff:fe9b:95e0                v2   ens4f0np0
mlx5_3  1     0      fe80:0000:0000:0000:966d:aeff:fe9b:95e1                v1   ens4f1np1
mlx5_3  1     1      fe80:0000:0000:0000:966d:aeff:fe9b:95e1                v2   ens4f1np1
mlx5_4  1     0      fe80:0000:0000:0000:1270:fdff:fe86:5c7e                v1   ens7f0np0
mlx5_4  1     1      fe80:0000:0000:0000:1270:fdff:fe86:5c7e                v2   ens7f0np0
mlx5_5  1     0      fe80:0000:0000:0000:1270:fdff:fe86:5c7f                v1   ens7f1np1
mlx5_5  1     1      fe80:0000:0000:0000:1270:fdff:fe86:5c7f                v2   ens7f1np1
n_gids_found=16
Host B:
DEV     PORT  INDEX  GID                                      IPv4          VER  DEV
---     ----  -----  ---                                      ------------  ---  ---
mlx5_2  1     0      fe80:0000:0000:0000:0040:abff:febb:0745                v2   enp3s0f0s0
mlx5_3  1     0      fe80:0000:0000:0000:0017:2eff:fef9:f1f6                v2   enp3s0f1s0
n_gids_found=2
I'd start with a process of elimination.
As a rule of thumb, if two interfaces cannot ping each other (assuming their IP addresses are configured correctly), they are unlikely to talk to each other over RDMA either.
The first question is which interfaces can actually reach the remote host. For instance, you can eliminate the entries that belong to PFs not connected to the desired network (ports with no cable inserted, ports that are down, or ports connected to a different switch).
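As a rough sketch of that first elimination step, the snippet below reads the Linux sysfs `operstate` file for each netdev listed in the last column of show_gids and keeps only those reporting "up". The function names `link_state` and `usable_netdevs` are made up for illustration, and the check only tells you the link is up locally, not that it reaches the other host, so a ping test is still needed afterwards.

```python
from pathlib import Path


def link_state(netdev: str, sysfs_root: str = "/sys/class/net") -> str:
    """Return the kernel operstate of a netdev ("up", "down", ...).

    Returns "missing" if the interface does not exist on this machine.
    """
    path = Path(sysfs_root) / netdev / "operstate"
    try:
        return path.read_text().strip()
    except OSError:
        return "missing"


def usable_netdevs(netdevs, sysfs_root: str = "/sys/class/net"):
    """Keep only the netdevs whose link is reported as up."""
    return [dev for dev in netdevs if link_state(dev, sysfs_root) == "up"]


if __name__ == "__main__":
    # Netdev names taken from the last column of Host A's show_gids output.
    host_a = ["ens2f0np0", "ens2f1np1", "ens4f0np0",
              "ens4f1np1", "ens7f0np0", "ens7f1np1"]
    for dev in host_a:
        print(dev, link_state(dev))
```

Run on Host A, anything that prints "down" or "missing" can be dropped from the candidate list before looking at GIDs at all.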
Since Host B has only two GIDs, both v2, you can safely drop all the v1 entries on Host A.
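To make the filtering concrete, here is a minimal sketch that parses pasted show_gids text, keeps only the v2 rows, and builds the matching ib_send_bw arguments. The helper names (`GidEntry`, `parse_show_gids`, `v2_only`, `ib_send_bw_args`) are hypothetical, and the parser assumes the whitespace-separated layout shown in the question, where the IPv4 column may be empty. Note that the value passed to `-x` is the number from the INDEX column, not the GID string itself.

```python
from typing import List, NamedTuple, Optional


class GidEntry(NamedTuple):
    dev: str
    port: int
    index: int
    gid: str
    ipv4: Optional[str]
    ver: str
    netdev: str


def parse_show_gids(text: str) -> List[GidEntry]:
    """Parse show_gids output; header, dashes and footer lines are skipped."""
    entries = []
    for line in text.splitlines():
        f = line.split()
        # Data rows have 6 fields (no IPv4) or 7 fields (with IPv4),
        # and a numeric PORT in the second column.
        if len(f) not in (6, 7) or not f[1].isdigit():
            continue
        ipv4 = f[4] if len(f) == 7 else None
        entries.append(GidEntry(f[0], int(f[1]), int(f[2]), f[3],
                                ipv4, f[-2], f[-1]))
    return entries


def v2_only(entries: List[GidEntry]) -> List[GidEntry]:
    """Drop every entry whose VER column is not v2."""
    return [e for e in entries if e.ver == "v2"]


def ib_send_bw_args(entry: GidEntry, server: Optional[str] = None) -> List[str]:
    """Build an ib_send_bw command line; pass `server` on the client side."""
    args = ["ib_send_bw", "-d", entry.dev, "-i", str(entry.port),
            "-x", str(entry.index)]
    return args + [server] if server else args


SAMPLE = """\
DEV PORT INDEX GID IPv4 VER DEV
--- ---- ----- --- ------------ --- ---
mlx5_0 1 0 fe80:0000:0000:0000:4ab0:2dff:fe8f:add0 v1 ens2f0np0
mlx5_0 1 1 fe80:0000:0000:0000:4ab0:2dff:fe8f:add0 v2 ens2f0np0
mlx5_0 1 2 0000:0000:0000:0000:0000:ffff:c0a8:0001 192.168.0.1 v1 ens2f0np0
mlx5_0 1 3 0000:0000:0000:0000:0000:ffff:c0a8:0001 192.168.0.1 v2 ens2f0np0
n_gids_found=4
"""

if __name__ == "__main__":
    for entry in v2_only(parse_show_gids(SAMPLE)):
        print(" ".join(ib_send_bw_args(entry)))
```

On the full Host A listing this leaves eight v2 candidates, which the reachability check then narrows down further.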
Notice that GIDs starting with fe80 refer to link-local IPv6 addresses, while the others are tied to a specific IP address set on the interface. The link-local GIDs are fine to use as long as both ends are within the same logical local network.
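If you want to tell the two kinds of GID apart programmatically, Python's standard `ipaddress` module can do it. This is only an illustrative sketch (the function name `classify_gid` is made up); it treats the GID string as an IPv6 address, which matches how show_gids prints it.

```python
import ipaddress


def classify_gid(gid: str) -> str:
    """Label a GID as link-local, IPv4-mapped, or other."""
    addr = ipaddress.IPv6Address(gid)
    if addr.is_link_local:
        # fe80::/10, derived from the port's MAC address
        return "link-local"
    if addr.ipv4_mapped is not None:
        # ::ffff:a.b.c.d, derived from an IPv4 address on the netdev
        return f"ipv4-mapped ({addr.ipv4_mapped})"
    return "other"


if __name__ == "__main__":
    for gid in ("fe80:0000:0000:0000:4ab0:2dff:fe8f:add0",
                "0000:0000:0000:0000:0000:ffff:c0a8:0001"):
        print(gid, "->", classify_gid(gid))
```

For the two GIDs above, the first is reported as link-local and the second maps back to 192.168.0.1, matching the IPv4 column in the Host A listing.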