Open MPI Development Mailing List Archives

From: Ralph Castain (rhc_at_[hidden])
Date: 2007-07-17 08:39:10

On 7/17/07 5:37 AM, "Jeff Squyres" <jsquyres_at_[hidden]> wrote:

> On Jul 16, 2007, at 2:28 PM, Matthew Moskewicz wrote:
>>> MPI-2 does support the MPI_COMM_JOIN and MPI_COMM_ACCEPT/
>>> MPI_COMM_CONNECT models. We do support this in Open MPI, but the
>>> restrictions (in terms of ORTE) may not be sufficient for you.
>> perhaps i'll experiment -- any clues as to what the orte
>> restrictions might be?
> The main constraint is that you have to run a "persistent" orted that
> will span all your MPI_COMM_WORLD's. We have only lightly tested
> this scenario -- Ralph, can you comment more here?

Actually, I'm not convinced Open MPI really supports either of those two MPI
semantics. It is true that we have something in our code repository, but I'm
not convinced it actually does what people think.

There are two use-cases one must consider:

1. an application code spawns another job and then at some later point wants
to connect to it. Our current implementation of comm_spawn does this
automatically via the accept/connect procedure, so we have this covered.
However, it relies upon the notion that (a) the parent job *knows* the jobid
of the child, and (b) the parent sends a message to the child telling it
where and how to rendezvous with it. You don't need the persistent daemon
for this use-case.

2. a user starts one application, and then starts another (would have to be
in a separate window or batch job as we do not support running mpirun in the
background) that connects to the first. The problem here is that neither
application knows the jobid of the other, has any info on how to communicate
with the other, or knows a common rendezvous point. You would definitely
need a persistent daemon for this use-case.
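
To make the difference between the two use-cases concrete, here is a toy
model in Python. It is only a sketch of the rendezvous problem, not Open MPI
or ORTE code: the names Job, PersistentDaemon, NETWORK, and the service name
"my-service" are all hypothetical, and the publish/lookup pair is merely
modeled on the idea behind MPI-2's MPI_Publish_name and MPI_Lookup_name. The
assumption is that a job can be reached only by someone who already holds
its contact string.

```python
# Toy model only: every name here is hypothetical, nothing is Open MPI API.

NETWORK = {}  # contact string -> Job; usable only if you know the string


class Job:
    """Stand-in for one MPI job (one MPI_COMM_WORLD)."""

    def __init__(self, jobid):
        self.jobid = jobid
        self.contact = f"port-of-job-{jobid}"
        self.connected_to = set()
        NETWORK[self.contact] = self

    def connect(self, contact):
        """Join the job that owns `contact`; fails if the contact is unknown."""
        peer = NETWORK[contact]
        self.connected_to.add(peer.jobid)
        peer.connected_to.add(self.jobid)


class PersistentDaemon:
    """Stand-in for a daemon that outlives, and is visible to, every job."""

    def __init__(self):
        self._services = {}

    def publish(self, service_name, contact):
        self._services[service_name] = contact

    def lookup(self, service_name):
        return self._services.get(service_name)


def use_case_1():
    """Parent spawns child: the contact is handed over directly."""
    parent = Job(1)
    child = Job(2)  # created by the parent, so the parent knows jobid 2
    child.connect(parent.contact)  # parent told the child where to rendezvous
    return parent, child


def use_case_2(daemon):
    """Two separately started jobs: they need a common rendezvous point."""
    server = Job(10)
    client = Job(20)  # started independently; knows nothing about job 10
    daemon.publish("my-service", server.contact)
    contact = daemon.lookup("my-service")
    if contact is None:
        raise LookupError("no common rendezvous point, cannot connect")
    client.connect(contact)
    return server, client
```

In use_case_1 no daemon is involved because the parent passes its contact
along when it creates the child. In use_case_2 the only thing the two jobs
share is the daemon, which is why it has to persist across both of them.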

I would have to review the code to see, but my best guess from what I
remember is that we don't actually support the second use-case at this time.
It would be possible to do so, albeit complicated - but I'm almost certain
nobody ever implemented it. I had talked at one time about providing the
necessary glue, either at the command line or (better) via some internal
"magic", but never got much interest - and so never did anything about
it...and I don't recall seeing anyone else make the necessary changes.

>>> - It also likely doesn't work yet; we started the integration work
>>> and ran into a technical issue that required further discussion with
>>> Platform. They're currently looking into it; we stopped the LSF work
>>> in ORTE until they get back to us.
>> i see -- i might be trying to work on the 6.x support today. can you
>> give me any hints on what the problem was in case i run into the same
>> issue?
> Something was wrong with the lsb_launch() function; using it caused a
> significant slowdown in the job and it generally wasn't behaving as
> expected. Platform issued a fix for me yesterday (i.e., a one-off/
> unsupported binary for development purposes) that I haven't gotten to
> test yet.
>>> - That being said, MPI_THREAD_MULTIPLE and MPI_COMM_SPAWN *might*
>>> offer a way out here. But I think a) THREAD_MULTIPLE isn't working
>>> yet (other OMPI members are working on this), and b) even when
>>> THREAD_MULTIPLE works, there will be ORTE issues to deal with
>>> (canceling pending resource allocations, etc.). Ralph mentioned that
>>> someone else is working on such things on the TM/PBS/Torque side; I
>>> haven't followed that effort closely.
>> it seems that MPI_THREAD_MULTIPLE is to be avoided for now, but there
>> are perhaps other workarounds (using threads in other ways, etc.).
>> also, i'd love to hear about the existing efforts -- i'm hoping
>> someone working on them might be reading this ... ;)
> Ralph -- can you chime in on the TM/PBS/Torque efforts?

It isn't my work. I can ask the other developer if he is interested in
talking with you and/or willing for me to make his work more public (part of
it has been discussed on the public user list). I believe this is part of
his PhD thesis, so I want to err on the side of caution here.