Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E
From: Vineet Rawat (vineetrawat0_at_[hidden])
Date: 2014-06-09 18:49:17


On Mon, Jun 9, 2014 at 3:31 PM, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]>
wrote:

> On Jun 9, 2014, at 5:41 PM, Vineet Rawat <vineetrawat0_at_[hidden]> wrote:
>
> > We've deployed OpenMPI on a small cluster but get a SEGV in orted. Debug
> > information is very limited as the cluster is at a remote customer site.
> > They have a network card with which I'm not familiar (Cisco Systems Inc VIC
> > P81E PCIe Ethernet NIC) and it seems capable of using the usNIC BTL.
>
> Unfortunately, this is the 1st generation Cisco VIC -- our usNIC BTL is
> only enabled starting with the 2nd generation Cisco VIC (the 12xx series,
> not the Pxxx series).
>
> So runs over this Ethernet NIC should be using just plain ol' TCP.
>

OK, that should be fine here.

> > I'm suspicious that it might be at the root of the problem. They're also
> > bonding the 2 ports.
>
> FWIW, it's not necessary to bond the interfaces for Open MPI -- meaning
> that Open MPI will automatically stripe large messages across multiple IP
> interfaces, etc. So if they're bonding for the purposes of MPI bandwidth,
> you can tell them to turn off the bonding.
>

They said they're doing it for resilience, not bandwidth.

>
> Also note that, by default, Open MPI's TCP MPI transport will aggressively
> use *all* IP interfaces that it finds. So in your case, it will likely use
> bond0, eth0, *and* eth1. Meaning: OMPI can effectively oversubscribe the
> network coming out of each VIC. You might want to set a system-wide
> default MCA parameter to have OMPI not use the bond0 interface. For
> example, add this line to $prefix/etc/mca-params.conf:
>
> btl_tcp_if_include = eth0,eth1
>
> This will have OMPI *only* use eth0 and eth1 -- it'll ignore lo and bond0.
>

OK, will do.
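
For reference, a minimal sketch of how that setting could be applied without
leaving duplicate or stale entries behind. The helper name set_mca_param is
hypothetical (it is not part of Open MPI), and the sketch assumes the file
uses the simple "key = value" form shown in the quoted line above.

```shell
#!/bin/sh
# Sketch: idempotently set one MCA parameter in an mca-params.conf file.
# set_mca_param is a hypothetical helper, not an Open MPI tool.
set_mca_param() {
    conf_file=$1
    key=$2
    value=$3
    tmp_file="${conf_file}.tmp.$$"

    # Make sure the file exists so grep has something to read.
    touch "$conf_file" || return 1

    # Copy every line except existing assignments of this key.
    # grep exits non-zero when nothing is copied; that is not an error here.
    grep -v -E "^[[:space:]]*${key}[[:space:]]*=" "$conf_file" > "$tmp_file" || true

    # Append the new assignment and swap the file into place.
    printf '%s = %s\n' "$key" "$value" >> "$tmp_file"
    mv "$tmp_file" "$conf_file"
}
```

It would be called as
set_mca_param "$prefix/etc/mca-params.conf" btl_tcp_if_include eth0,eth1
where $prefix stands for the Open MPI installation prefix. Running it twice
leaves a single btl_tcp_if_include line, and any earlier value (for example
bond0) is replaced rather than kept alongside the new one.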

>
> > However, we're also doing a few unusual things which could be causing
> > problems. Firstly, we built OpenMPI (I tried 1.6.4 and 1.8.1) without the
> > ibverbs or usnic BTLs. Then, we only ship what (we think) we need: orterun,
> > orted, libmpi, libmpi_cxx, libopen-rte and libopen-pal. Could there be a
> > dependency on some other binary executable or dlopen'ed library? We also
> > use a special plm_rsh_agent but we've used this approach for some time
> > without issue.
>
> All that sounds fine.
>
> Open MPI 1.8.1 is preferred; the 1.6.x series is pretty old at this point.
> If there's a bug in 1.8.1, it's a whole lot easier for us to fix it in the
> 1.8.x series.
>

Yes, we've been deploying 1.6.4 for a while and are wary of change. We only
went to 1.8.1 to see if it changed anything related to this issue. I
completely understand that any fixes, if needed, are likely to go in the
latest version.

>
> > I tried a few different MCA settings, the most restrictive of which led
> > to the failure of this command:
> >
> > orted --debug --debug-daemons -mca ess env -mca orte_ess_jobid
> > 1925054464 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 -mca orte_hnp_uri
> > \"1925054464.0;tcp://10.xxx.xxx.xxx:40547\" --tree-spawn --mca
> > orte_base_help_aggregate 1 --mca plm_rsh_agent yyy --mca
> > btl_tcp_port_min_v4 2000 --mca btl_tcp_port_range_v4 100 --mca btl tcp,self
> > --mca btl_tcp_if_include bond0 --mca orte_create_session_dirs 0 --mca
> > plm_rsh_assume_same_shell 0 -mca plm rsh -mca orte_debug_daemons 1 -mca
> > orte_debug 1 -mca orte_tag_output 1
> >
> > It seems that the host is set up such that the core file is generated
> > and immediately removed ("ulimit -c" is unlimited) but the abrt daemon is
> > doing something weird.
>
> As Ralph mentioned, can you verify that the correct version of the MPI
> libraries is being picked up on the remote servers? E.g., is LD_LIBRARY_PATH being
> set properly in the shell startup files on the remote servers (e.g., to
> find the 1.8.1 shared libraries)?
>
> Also make sure that you install each version of Open MPI into a "clean"
> directory -- don't install OMPI 1.6.x into /foo and then install OMPI 1.8.x
> into /foo, too. The two versions are incompatible with each other, and
> have conflicting/not-wholly-overlapping libraries. Meaning: if you install
> OMPI 1.6.x into /foo, you should either "rm -rf /foo" before you install
> OMPI 1.8.x into /foo, or just install OMPI 1.8.x into /bar.
>

The installations are entirely separate. The LD_LIBRARY_PATH is set up by
our own launch wrapper and I'm confident it's correct.
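
A rough way to double-check this on a remote node is sketched below. It
reports the first directory in a colon-separated search path that contains a
matching library, on the assumption that the dynamic linker walks
LD_LIBRARY_PATH in order (an RPATH or RUNPATH baked into the binary can change
what is actually loaded, so this is only a first sanity check). The helper
name first_lib_dir is hypothetical.

```shell
#!/bin/sh
# Sketch: print the first directory of a colon-separated search path that
# contains a file matching the given glob.
# first_lib_dir is a hypothetical helper, not an Open MPI tool.
# The body runs in a subshell so the IFS and "set" changes do not leak out.
first_lib_dir() (
    lib_glob=$1

    # Split the search path on ":" with globbing disabled during the split.
    set -f
    IFS=:
    set -- $2
    set +f
    unset IFS

    for dir in "$@"; do
        # Empty entries (which the linker treats as the current directory)
        # are skipped in this sketch.
        [ -n "$dir" ] || continue
        for candidate in "$dir"/$lib_glob; do
            if [ -e "$candidate" ]; then
                printf '%s\n' "$dir"
                exit 0
            fi
        done
    done
    exit 1
)
```

Run from inside the launch wrapper's environment, something like
first_lib_dir 'libopen-rte.so*' "$LD_LIBRARY_PATH"
should print the lib directory of the intended installation. If it prints a
different directory, or nothing at all, the wrapper's path is worth a second
look.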

Vineet

> Make sense?

> --
> Jeff Squyres
> jsquyres_at_[hidden]