Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] -mca coll "ml" cause segv or hangs with different command lines.
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-03-04 06:06:54

Ummm... the "ml" stands for Mellanox. This is a component you folks
contributed at some point. IIRC, hcoll and/or bcol are meant to replace
it, but you folks would know best what to do with it.

On Tue, Mar 4, 2014 at 12:12 AM, Elena Elkina <elena.elkina_at_[hidden]> wrote:

> Hi,
> Recently I have often been running into hangs and seg faults with various
> command lines, and there are "ml" functions in the stack trace.
> When I simply turn "ml" off with -mca coll ^ml, the problems disappear.
> For example,
> oshrun -np 4 --map-by node --display-map ./ring_oshmem
> fails with a seg fault, while
> oshrun -np 4 --map-by node --display-map -mca coll ^ml ./ring_oshmem
> passes.
> The "ml" priority is low (27), but it could still hit issues during
> comm_query (it does all of its initialization work there).
> "ml" is an unreliable component, so it may be reasonable not to build it
> by default to avoid such problems.
> What do you think?
> Best regards,
> Elena
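
For anyone hitting the same crash, the workaround from the report (excluding "ml" from the coll framework) can probably also be applied through the environment instead of the command line. This is only a sketch: it assumes that oshrun honors the usual OMPI_MCA_<name> environment variables the same way mpirun does, and it reuses the ring_oshmem test binary named in the report.

```shell
# Assumption: Open MPI reads MCA parameters from OMPI_MCA_<name> environment
# variables, so this should have the same effect as passing "-mca coll ^ml".
export OMPI_MCA_coll="^ml"

# Re-run the failing case from the report, but only if the launcher is on PATH.
if command -v oshrun >/dev/null 2>&1; then
    oshrun -np 4 --map-by node --display-map ./ring_oshmem
fi
```

Setting the variable once per shell session keeps "ml" excluded from every later run without touching each command line. It does not address the underlying problem in comm_query.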