Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] SLURM environment variables at runtime
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-02-24 20:25:55


I guess I wasn't clear earlier - I don't know anything about how HP-MPI
works. I was only theorizing that perhaps they did something different that
results in some other slurm vars showing up in Brent's tests. From Brent's
comments, I guess they don't - but they launch jobs in a different manner
that results in some difference in the slurm envars seen by application
procs.

I don't believe we have a bug in OMPI. What we have is behavior that
reflects how the proc is launched. If an app has integrated itself tightly
with slurm, then OMPI may not be a good choice - or they can try the
"slurm-direct" launch method in 1.5 and see if that meets their needs.
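
To illustrate "behavior that reflects how the proc is launched", here is a
minimal Python simulation. It is not Open MPI code: the helper name
launch_children and the daemon's SLURM_PROCID value of 0 are assumptions made
for the example. It only shows that every child of one daemon inherits the
daemon's SLURM_PROCID, while a per-process variable such as
OMPI_COMM_WORLD_RANK can still differ.

```python
import os
import subprocess
import sys

# Tiny child program: report the two variables it sees in its environment.
CHILD = (
    "import os;"
    "print(os.environ['SLURM_PROCID'], os.environ['OMPI_COMM_WORLD_RANK'])"
)


def launch_children(daemon_procid, nprocs):
    """Simulate one per-node daemon spawning nprocs application processes.

    Hypothetical helper for illustration only; this is not how orted is
    implemented, just a model of environment inheritance.
    """
    results = []
    for rank in range(nprocs):
        env = dict(os.environ)
        # Inherited unchanged from the (simulated) daemon's environment.
        env["SLURM_PROCID"] = str(daemon_procid)
        # Set individually for each process by the (simulated) launcher.
        env["OMPI_COMM_WORLD_RANK"] = str(rank)
        out = subprocess.run(
            [sys.executable, "-c", CHILD],
            env=env, capture_output=True, text=True, check=True,
        ).stdout
        slurm_id, ompi_rank = out.split()
        results.append((int(slurm_id), int(ompi_rank)))
    return results


if __name__ == "__main__":
    for slurm_id, ompi_rank in launch_children(daemon_procid=0, nprocs=4):
        print(f"SLURM_PROCID={slurm_id} OMPI_COMM_WORLD_RANK={ompi_rank}")
```

In this model, all four children report the same SLURM_PROCID but distinct
OMPI_COMM_WORLD_RANK values.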

There may be something going on with slurm 2.2.x - as I've said before,
slurm makes major changes in even minor releases, and trying to track them
is a nearly impossible task, especially as many of these features are
configuration dependent. What we have in OMPI is the level of slurm
integration required by the three DOE weapons labs as (a) they represent the
largest component of the very small slurm community, and (b) in the past,
they provided the majority of the slurm integration effort within ompi. It
works as they need it to, given the way they configure slurm (which may not
be how others do).

I'm always willing to help other slurm users, but within the guidelines
expressed in an earlier thread - the result must be driven by the DOE
weapons labs' requirements, and cannot interfere with their usage models.

As for slurm_procid - if an application is looking for it, it sounds like
OMPI may not be a good choice for them. Under OMPI, slurm does not see
the application procs and has no idea they exist. Slurm's knowledge of an
OMPI job is limited solely to the daemons. This has tradeoffs, as most
design decisions do - in the case of the DOE labs, the tradeoffs were judged
favorable...at least, as far as LANL was concerned, and they were my boss
when I wrote the code :-) At LLNL's request, I did create the ability to run
jobs directly under srun - but as Jeff noted, with reduced capability.
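
For an application that needs its rank from the environment, one possible
approach is to prefer the OMPI_COMM_WORLD_* variables and fall back to the
slurm ones only when the process was started directly under srun. The sketch
below is an illustration, not part of OMPI: the function name
get_rank_and_size is hypothetical, and using SLURM_NTASKS for the job size is
an assumption that may not hold for every slurm version or configuration.

```python
import os


def get_rank_and_size(environ=None):
    """Return (rank, size, source) from launcher-provided variables.

    Hypothetical helper: prefers Open MPI's variables, then slurm's,
    and otherwise assumes a single-process run.
    """
    environ = os.environ if environ is None else environ

    # Under mpirun, the OMPI_COMM_WORLD_* values are set per process.
    if "OMPI_COMM_WORLD_RANK" in environ:
        rank = int(environ["OMPI_COMM_WORLD_RANK"])
        size = int(environ.get("OMPI_COMM_WORLD_SIZE", 1))
        return rank, size, "ompi"

    # Only trust SLURM_PROCID when no OMPI variables are present, i.e. the
    # process was presumably launched directly by srun.
    if "SLURM_PROCID" in environ:
        rank = int(environ["SLURM_PROCID"])
        size = int(environ.get("SLURM_NTASKS", 1))  # assumed variable name
        return rank, size, "slurm"

    # Neither launcher is visible: treat as a standalone run.
    return 0, 1, "none"


if __name__ == "__main__":
    rank, size, source = get_rank_and_size()
    print(f"rank={rank} size={size} source={source}")
```

Checking the OMPI variables first matters because, under mpirun, a
SLURM_PROCID inherited from the daemon may also be present in the environment.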

Hope that helps clarify what is in the code, and why. I'm not sure what
motivated the original question, but hopefully ompi's slurm support is a
little bit clearer?

Ralph

On Thu, Feb 24, 2011 at 2:08 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:

> On Feb 24, 2011, at 2:59 PM, Henderson, Brent wrote:
>
> > [snip]
> > They really can't be all SLURM_PROCID=0 - that is supposed to be unique
> for the job - right? It appears that the SLURM_PROCID is inherited from the
> orted parent - which makes a fair amount of sense given how things are
> launched.
>
> That's correct, and I can agree with your sentiment.
>
> However, our design goals were to provide a consistent *Open MPI*
> experience across different launchers. Providing native access to the actual
> underlying launcher was a secondary goal. Balancing those two, you can see
> why we chose the model we did: our orted provides (nearly) the same
> functionality across all environments.
>
> In SLURM's case, we propagate [seemingly] nonsensical SLURM_PROCID values
> to the individual processes, but they are nonsensical only if you are making
> an assumption about how Open MPI is using SLURM's launcher.
>
> More specifically, our goal is to provide consistent *Open MPI information*
> (e.g., through the OMPI_COMM_WORLD* env variables) -- not emulate what SLURM
> would have done if MPI processes had been launched individually through
> srun. Even more specifically: we don't think that the exact underlying
> launching mechanism that OMPI uses is of interest to most users; we
> encourage them to use our portable mechanisms that work even if they move to
> another cluster with a different launcher. Admittedly, that does make it a
> little more challenging if you have to support multiple MPI implementations,
> and although that's an important consideration to us, it's not our first
> priority.
>
> > Now to answer the other question - why are there some variables missing.
> It appears that when the orted processes are launched - via srun but only
> one per node, it is a subset of the main allocation and thus some of the
> environment variables are not the same (or missing entirely) as compared to
> launching them directly with srun on the full allocation. This also makes
> sense to me at some level, so I'm at peace with it now. :)
>
> Ah, good.
>
> > Last thing before I go. Please let me apologize for not being clear on
> what I disagreed with Ralph about in my last note. Clearly he nailed the
> orted launching process and spelled it out very clearly, but I don't believe
> that HP-MPI is doing anything special to copy/fix up the SLURM
> environment variables. Hopefully that was clear by the body of that
> message.
>
> No worries; you were perfectly clear. Thanks!
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/