Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] mpirun: specify multiple install prefixes
From: Ralph H Castain (rhc_at_[hidden])
Date: 2007-12-20 08:32:31

I'm afraid not - nor is it in the plans for 1.3 either. It fell through
the cracks as the needs of the developer community moved in other
directions.

I'll raise the question internally and see if people feel we should do this.
It wouldn't be hard to put it into 1.3 at this point, but it will become
very hard if it isn't done very soon.

Thanks for the reminder!

On 12/14/07 9:45 AM, "Pignot Geoffroy" <geopignot_at_[hidden]> wrote:

> Hi,
> I would just like to know whether this functionality (a prefix field in
> the hostfile, if I understand correctly) has been integrated into 1.2.4?
> Thanks for your answer.
> ------- On Mar 22, 2007, at 10:38 AM, Ralph Castain wrote:
> We had a nice chat about this on the OpenRTE telecon this morning. The
> question of what to do with multiple prefixes has been a long-running
> issue, most recently captured in Trac bug report #497. The problem is
> that prefix is intended to tell us where to find the ORTE/OMPI
> executables, and is therefore associated with a node - not an
> app_context. What we haven't been able to define is an appropriate
> notation that a user can exploit to tell us the association.
>
> This issue has arisen on several occasions where either (a) users have
> heterogeneous clusters with a common file system, so the prefix must be
> adjusted on each *type* of node to point to the correct type of binary;
> or (b) for whatever reason, typically on rsh/ssh clusters, users have
> installed the binaries in different locations on some of the nodes. In
> this latter case, the reports have been from homogeneous clusters, so
> the *type* of binary was never the issue - it just wasn't located where
> we expected.
>
> Sun's solution is (I believe) what most of us would expect - they locate
> their executables in the same relative location on all their nodes. The
> binary in that location is correct for that local architecture. This
> requires, though, that the "prefix" location not be on a common file
> system.
>
> Unfortunately, that isn't the case with LANL's Roadrunner, nor can we
> expect that everyone will follow that sensible approach :-). So we need
> a notation to support the "exception" case where someone truly needs to
> specify prefix versus node(s).
>
> We discussed a number of options, including auto-detecting the local
> arch and appending it to the specified "prefix" and several others.
> After discussing them, those of us on the call decided that adding a
> field to the hostfile that specifies the prefix to use on that host
> would be the best solution. This could be done on a cluster-level
> basis, so - although it is annoying to create the data file - at least
> it would only have to be done once.
>
> Again, this is the exception case, so requiring a little inconvenience
> seems a reasonable thing to do.
>
> Anyone have heartburn and/or other suggestions? If not, we might start
> to play with this next week. We would have to make some small
> modifications to the RAS, RMAPS, and PLS components to ensure that any
> multi-prefix info gets correctly propagated and used across all
> platforms for consistent behavior.
> Ralph
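
For readers trying to picture the per-host prefix field proposed in the quoted message, here is a minimal sketch in Python. The thread never defines a syntax and, per the reply above, the feature was not implemented, so the `prefix=` key, the function names `parse_hostfile` and `daemon_path`, and the example paths are all assumptions made for illustration. Only the `slots=` key and the `orted` daemon name come from Open MPI itself.

```python
import posixpath
from typing import Dict, Optional

# Hypothetical hostfile: the "prefix=" field is NOT real Open MPI syntax,
# it only illustrates the idea discussed in the thread.
EXAMPLE = """\
# host    slots    per-host install prefix (hypothetical field)
node01 slots=4 prefix=/opt/openmpi-x86_64
node02 slots=4 prefix=/opt/openmpi-ppc64
node03 slots=2
"""


def parse_hostfile(text: str,
                   default_prefix: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Map each host name to the install prefix to use on that host.

    Hosts without a prefix= field fall back to default_prefix, which
    plays the role of a single global prefix.
    """
    prefixes: Dict[str, Optional[str]] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        host, *fields = line.split()
        prefix = default_prefix
        for field in fields:
            key, sep, value = field.partition("=")
            if sep and key == "prefix":
                prefix = value
        prefixes[host] = prefix
    return prefixes


def daemon_path(prefixes: Dict[str, Optional[str]], host: str) -> str:
    """Return where the ORTE daemon would be looked up on the given host."""
    prefix = prefixes[host]
    if prefix is None:
        return "orted"  # no prefix known: rely on the remote PATH
    return posixpath.join(prefix, "bin", "orted")


if __name__ == "__main__":
    table = parse_hostfile(EXAMPLE, default_prefix="/usr/local")
    for name in table:
        print(name, daemon_path(table, name))
```

Keeping the association in a per-host table matches the point made in the quoted message that a prefix belongs to a node rather than to an app_context, and the file only has to be written once per cluster.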