> The trouble is when I try to add some "--mca" parameters to force it to
> use TCP/Ethernet, the program seems to hang. I get the headers of the
> "osu_bw" output, but no results, even on the first case (1 byte payload
> per packet). This is occurring on both the IB-enabled nodes, and on the
> Ethernet-only nodes. The specific syntax I was using was: "mpirun
> --mca btl ^openib --mca btl_tcp_if_exclude ib0 ./osu_bw"
When we want to run over TCP and IPoIB on an IB/PSM-equipped cluster, we use:
--mca btl sm,tcp,self --mca btl_tcp_if_include ib0 --mca mtl ^psm
Based on this, the following might work for you:
--mca btl sm,tcp,self --mca btl_tcp_if_include eth0
Listing sm, tcp, and self explicitly already excludes openib, so there's no need to add "--mca btl ^openib" (a second "--mca btl" setting would simply override the first). Also note that btl_tcp_if_include and btl_tcp_if_exclude are mutually exclusive, so specify only one of them; if the IB nodes don't have ib0 interfaces configured, you probably don't need "--mca btl_tcp_if_exclude ib0" at all.
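On the debugging question: Open MPI's verbose MCA parameters are usually more helpful here than application-level debugging, since they show which BTLs and interfaces were actually selected. A sketch of an invocation (the hostnames and benchmark path are placeholders for your site):

```shell
# Force the shared-memory, TCP, and self BTLs, restrict TCP to the
# Ethernet interface, and print BTL selection details during startup.
# node1, node2, and the path to osu_bw are placeholders.
mpirun -np 2 --host node1,node2 \
    --mca btl sm,tcp,self \
    --mca btl_tcp_if_include eth0 \
    --mca btl_base_verbose 30 \
    ./osu_bw
```

If the run hangs at the first message size, the verbose output will usually show the TCP BTL trying to connect over an interface the peers can't actually reach each other on.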
> The problem occurs at least with OpenMPI 1.6.3 compiled with GNU 4.4
> compilers, with 1.6.3 compiled with Intel 13.0.1 compilers, and with
> 1.6.5 compiled with Intel 13.0.1 compilers. I haven't tested any other
> combinations yet.
> Any ideas here? It's very possible this is a system configuration
> problem, but I don't know where to look. At this point, any ideas would
> be welcome, either about the specific situation, or general pointers on
> mpirun debugging flags to use. I can't find much in the docs yet on
> run-time debugging for OpenMPI, as opposed to debugging the application.
> Maybe I'm just looking in the wrong place.
> Lloyd Brown
> Systems Administrator
> Fulton Supercomputing Lab
> Brigham Young University
> users mailing list