I am able to run an OpenMPI job on multiple nodes over ssh. Everything looks good, but I find that I do not know much about what is really happening. So, how do nodes communicate in OpenMPI? The job runs on multiple nodes, so it cannot be shared memory. It also does not seem to be TCP or UDP, because I have not configured any ports. Can anyone describe what happens when a message is sent and received between 2 processes on 2 nodes? Thanks!
How nodes communicate in OpenMPI
2.8k Views Asked by Fallin At
Open MPI is built on top of a framework of frameworks called the Modular Component Architecture (MCA). There are frameworks for different activities such as point-to-point communication, collective communication, parallel I/O, remote process launch, etc. Each framework is implemented as a set of components that provide different implementations of the same public interface.
Whenever the services of a specific framework are requested for the first time, e.g., those of the Byte Transfer Layer (BTL) or the Matching Transport Layer (MTL), both of which transfer messages between the ranks, MCA enumerates the various components capable of fulfilling the request and tries to instantiate them. Some components have specific requirements of their own, e.g., they require specific hardware to be present, and fail to instantiate if those aren't met. All components that instantiate successfully are scored and the one with the best score is chosen to carry out the request and other similar requests. Thus, Open MPI is able to adapt itself to different environments with very little configuration on the user side.
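The selection procedure described above can be sketched in a few lines of Python. This is only an illustration of the idea (try each component, drop the ones that fail to instantiate, keep the best-scored one) and not Open MPI's actual code: the `Component` class, the `select_component` function, the `has_infiniband` flag, and the priority numbers are all hypothetical, and `openib` and `tcp` are used purely as labels.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Component:
    """Hypothetical stand-in for one MCA component."""
    name: str
    priority: int                        # illustrative score, not a real Open MPI value
    can_instantiate: Callable[[], bool]  # e.g. "is the required hardware present?"


def select_component(candidates: List[Component]) -> Optional[Component]:
    """Try every candidate, drop those that fail, return the best-scored one."""
    usable = [c for c in candidates if c.can_instantiate()]
    if not usable:
        return None
    return max(usable, key=lambda c: c.priority)


# Assumption for the example: this node has Ethernet but no InfiniBand adapter.
has_infiniband = False

candidates = [
    Component("openib", priority=50, can_instantiate=lambda: has_infiniband),
    Component("tcp", priority=20, can_instantiate=lambda: True),
]

chosen = select_component(candidates)
print(chosen.name if chosen else "no usable component")
```

Although `openib` has the higher score in this sketch, it fails to instantiate on a node without the hardware, so the lower-scored component is the one that ends up being used.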
For communication between different ranks, the BTL and MTL frameworks provide multiple implementations, and the set depends heavily on the Open MPI version and how it was built. The `ompi_info` tool can be used to query the library configuration. On one of my machines, the components it lists are:

- `openib` -- uses InfiniBand verbs to communicate over InfiniBand networks, which are among the most widespread high-performance communication fabrics for clusters nowadays, and over other RDMA-capable networks such as iWARP or RoCE
- `sm` -- uses shared memory to communicate on the same node
- `tcp` -- uses TCP/IP to communicate over any network that provides a sockets interface
- `vader` -- similarly to `sm`, provides shared memory communication on the same node
- `self` -- provides efficient self-communication
- `psm` -- uses the PSM library to communicate over networks derived from PathScale's InfiniBand variant, such as Intel Omni-Path (r.i.p.)
- `ofi` -- alternative InfiniBand transport that uses OpenFabrics Interfaces (OFI) instead of verbs

The first time rank A on `hostA` wants to talk to rank B on `hostB`, Open MPI will go through the list of modules. `self` only provides self-communication and will be excluded. `sm` and `vader` will get excluded since they only provide communication on the same node. If your cluster is not equipped with a high-performance network, the most likely candidate to remain is `tcp`, because there is literally no cluster node that doesn't have some kind of Ethernet connection to it.

The `tcp` component probes all network interfaces that are up and notes their network addresses. It opens listening TCP ports on all of them and publishes this information on a central repository (usually managed by the `mpiexec` process used to launch the MPI program). When the `MPI_Send` call in rank A requests the services of `tcp` in order to send a message to rank B, the component looks up the information published by rank B and selects all IP addresses that are in any of the networks that `hostA` is part of. It then tries to create one or more TCP connections, and upon success the messages start flowing.

In most cases, you do not need to configure anything and the `tcp` component Just Works™. Sometimes, though, it may need some additional configuration. For example, the default TCP port range may be blocked by a firewall and you may need to tell it to use a different one. Or it may select network interfaces that have the same network range but do not provide physical connectivity; typical cases are the virtual interfaces used by the various hypervisors or container services. In this case, you have to tell `tcp` to exclude those interfaces.

Configuring the various MCA components is done by passing in MCA parameters with the `--mca param_name param_value` command-line argument of `mpiexec`. You may query the list of parameters that a given MCA component has and their default values with `ompi_info --param framework component`. For example, `ompi_info --param btl tcp` lists the parameters of the `tcp` BTL component.

Parameters have different levels, and by default `ompi_info` only shows parameters of level 1 (user/basic parameters). This can be changed with the `--level N` argument to show parameters up to level `N`. The levels go all the way up to 9, and those with higher levels are only required in very advanced cases, such as fine-tuning the library or debugging issues. For example, `btl_tcp_port_min_v4` and `btl_tcp_port_range_v4`, which are used in tandem to specify the port range for TCP connections, are parameters of level 2 (user/detail).
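To make the `--mca param_name param_value` convention concrete, here is a small Python sketch that assembles an `mpiexec` command line from a dictionary of MCA parameters. Treat it as an illustration under assumptions rather than a recommended setup: the helper `build_mpiexec_command`, the program name `./my_mpi_app`, the port numbers, and the interface name `docker0` are made up for the example. `btl_tcp_if_exclude` is not mentioned above; to my knowledge it is the parameter the `tcp` component reads for excluded interfaces, but check the `ompi_info` output of your own installation.

```python
import shlex
from typing import Dict, List


def build_mpiexec_command(program: str, num_ranks: int,
                          mca_params: Dict[str, str]) -> List[str]:
    """Return an mpiexec argument list with one `--mca name value` triple per parameter."""
    cmd = ["mpiexec", "-n", str(num_ranks)]
    for name, value in mca_params.items():
        cmd += ["--mca", name, str(value)]
    cmd.append(program)
    return cmd


# Illustrative values only; pick ports that your firewall actually allows.
params = {
    "btl_tcp_port_min_v4": "46000",       # first port the tcp component may use
    "btl_tcp_port_range_v4": "100",       # number of ports starting at the minimum
    "btl_tcp_if_exclude": "lo,docker0",   # assumed names: loopback and a virtual interface
}

cmd = build_mpiexec_command("./my_mpi_app", 4, params)
print(shlex.join(cmd))  # a string you can paste into a shell
```

Building the arguments as a list and only joining them for display avoids quoting problems if the list is later handed to `subprocess.run`.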