On Jun 24, 2006, at 1:19 PM, George Bosilca wrote:
> As your cluster has several network devices that are supported by
> Open MPI it is possible that the configure script detected the
> correct path to their libraries. Therefore, they might be included/
> compiled by default in Open MPI. The simplest way to check is to use
> the ompi_info tool. "ompi_info | grep btl" will list all the network
> devices supported by your particular build.
>
> If several devices (called BTL in Open MPI terms) are compiled in,
> only forcing one eth interface for the TCP BTL is not enough. You
> should specify that you want only the TCP BTL to be used, forcing
> Open MPI to unload/ignore all other available BTL. Add "--mca btl
> tcp,self" to your mpirun command and the problem should be solved.
I've looked through the documentation, but I haven't found a
discussion of what each BTL device is. For example, I have:
MCA btl: self (MCA v1.0, API v1.0, Component v1.2)
MCA btl: sm (MCA v1.0, API v1.0, Component v1.2)
MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
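For checking a build by script rather than by eye, here is a minimal sketch that pulls the BTL names out of text shaped like the listing above. It assumes each component appears on its own line as "MCA btl: <name> (...)"; the names list_btls and missing_btls are made up for illustration and are not part of Open MPI, and the sample text is just the three lines shown above.

```python
import re

# Sample text copied from the ompi_info listing above.
SAMPLE = """\
MCA btl: self (MCA v1.0, API v1.0, Component v1.2)
MCA btl: sm (MCA v1.0, API v1.0, Component v1.2)
MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
"""

# Assumed line shape: optional indentation, "MCA btl:", then the component name.
BTL_LINE = re.compile(r"^\s*MCA btl:\s*(\S+)")


def list_btls(ompi_info_output):
    """Return the BTL component names found in ompi_info-style text."""
    names = []
    for line in ompi_info_output.splitlines():
        match = BTL_LINE.match(line)
        if match:
            names.append(match.group(1))
    return names


def missing_btls(ompi_info_output, wanted):
    """Return the names in `wanted` that do not appear in the output."""
    found = set(list_btls(ompi_info_output))
    return [name for name in wanted if name not in found]


if __name__ == "__main__":
    print(list_btls(SAMPLE))
    print(missing_btls(SAMPLE, ["gm", "mx"]))
```

On a real machine the text would come from running ompi_info instead of the SAMPLE string. A non-empty result from missing_btls only says the component is absent from that particular build, not why.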
I found a PDF presentation that describes a few:
tcp    - TCP/IP
openib - InfiniBand OpenIB stack
gm/mx  - Myrinet GM/MX
mvapi  - InfiniBand Mellanox Verbs
sm     - Shared Memory
Are there any others I may see when interacting with other people's
computers?
I assume that if a machine has Myrinet and I don't see MCA btl: gm or
MCA btl: mx, then I have to explain the problem to the sysadmins.
My second question: should I see both gm and mx, or only one or the
other?
Michael