Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: Force Slurm to use PMI-1 unless PMI-2 is specifically requested
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-05-07 22:54:25

On May 7, 2014, at 6:15 PM, Christopher Samuel <samuel_at_[hidden]> wrote:

> Hi all,
> Apologies for having dropped out of the thread, night intervened here. ;-)
> On 08/05/14 00:45, Ralph Castain wrote:
>> Okay, then we'll just have to develop a workaround for all those
>> Slurm releases where PMI-2 is borked :-(
> Do you know what these releases are? Are we talking 2.6.x or 14.03?
> The 14.03 series has had a fair few rapid point releases and doesn't
> appear to be anywhere as near as stable as 2.6 was when it came out. :-(

Yeah :-(

I think there was one 2.6.x release that was borked, and there were definitely problems in the 14.03.x line. I can't pinpoint the exact versions for you, though.

>> FWIW: I think people misunderstood my statement. I specifically
>> did *not* propose to *lose* PMI-2 support. I suggested that we
>> change it to "on-by-request" instead of the current "on-by-default"
>> so we wouldn't keep getting asked about PMI-2 bugs in Slurm. Once
>> the Slurm implementation stabilized, then we could reverse that
>> policy.
>> However, given that both you and Chris appear to prefer to keep it
>> "on-by-default", we'll see if we can find a way to detect that
>> PMI-2 is broken and then fall back to PMI-1.
> My intention was to provide the data that led us to want PMI2, but if
> configure had an option to enable PMI2 by default so that only those
> who requested it got it then I'd be more than happy - we'd just add it
> to our script to build it.

Sounds good. I'm going to have to dig deeper into those numbers, though, as they don't entirely add up to me. Once the job gets launched, the launch method itself should have no bearing on computational speed - IF all things are equal. In other words, if the process layout is the same, and the binding pattern is the same, then computational speed should be roughly equivalent regardless of how the procs were started.

My guess is that your data might indicate a difference in the layout and/or binding pattern as opposed to PMI2 vs mpirun. At the scale you mention later in the thread (only 70 nodes x 16 ppn), the difference in launch timing would be zilch. So I'm betting you would find (upon further exploration) that (a) you might not have been binding processes when launching via mpirun, since we didn't bind by default until the 1.8 series, but were binding under direct srun launch, and (b) your process mapping would quite likely be different, as we default to byslot mapping and I believe srun defaults to bynode?
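
To illustrate point (b), here is a minimal sketch of how the two mapping policies would place ranks at that scale. It is plain Python for illustration only, not Open MPI or Slurm code; the function names are hypothetical, and it assumes every node has exactly 16 slots and that ranks are numbered from 0.

```python
def map_by_slot(num_nodes, ppn):
    """Fill all slots on a node before moving to the next node.

    Returns a list where index = rank and value = node index.
    """
    return [rank // ppn for rank in range(num_nodes * ppn)]


def map_by_node(num_nodes, ppn):
    """Place ranks round-robin across nodes, one rank per node per pass."""
    return [rank % num_nodes for rank in range(num_nodes * ppn)]


def same_node_neighbors(mapping):
    """Count adjacent rank pairs (r, r+1) that land on the same node."""
    return sum(1 for a, b in zip(mapping, mapping[1:]) if a == b)


if __name__ == "__main__":
    nodes, ppn = 70, 16  # the job size mentioned in the thread
    by_slot = map_by_slot(nodes, ppn)
    by_node = map_by_node(nodes, ppn)
    print("byslot, ranks 0-17 -> nodes:", by_slot[:18])
    print("bynode, ranks 0-17 -> nodes:", by_node[:18])
    print("neighbor pairs sharing a node (byslot):", same_node_neighbors(by_slot))
    print("neighbor pairs sharing a node (bynode):", same_node_neighbors(by_node))
```

With byslot mapping, neighboring ranks mostly share a node, while with bynode mapping no two consecutive ranks do. For an application that communicates mainly with nearby ranks, that alone could change how much traffic stays on-node, independent of how the processes were launched.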

Might be worth another comparison run when someone has time.

> All the best!
> Chris
> --
> Christopher Samuel Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computation Initiative
> Email: samuel_at_[hidden] Phone: +61 (0)3 903 55545