Open MPI Development Mailing List Archives

From: George Bosilca (bosilca_at_[hidden])
Date: 2007-07-27 11:31:03

You were perfectly clear. What we're trying to say here is that the
solution you described a few emails ago doesn't work. At least it
doesn't work for what we want to do (i.e. what Aurelien described in
his first email). We [really] need 2 separate MPI worlds that we
will connect at a later moment, not one larger MPI world.
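
To make the distinction concrete, here is a toy model in plain Python (it is not Open MPI code, and the function name is made up for illustration). It only does the arithmetic: with a single mpirun, every app context sees the sum of all process counts as its MPI_COMM_WORLD size; with separate launches, each program sees only its own count. The process counts are the ones from Aurelien's NAS run quoted below.

```python
def comm_world_sizes(app_contexts, separate_jobs=False):
    """Return the MPI_COMM_WORLD size each program would observe.

    app_contexts: list of (program_name, process_count) pairs, one per
    colon-separated block of an mpirun command line.
    separate_jobs: False models one mpirun (one shared world);
    True models each program being started by its own mpirun.
    """
    if separate_jobs:
        # Each job has its own world, sized by its own process count.
        return {name: count for name, count in app_contexts}
    # One mpirun: all processes land in the same world.
    total = sum(count for _, count in app_contexts)
    return {name: total for name, _ in app_contexts}


contexts = [("lu.A.4", 4), ("mg.A.1", 1)]
one_world = comm_world_sizes(contexts)
two_worlds = comm_world_sizes(contexts, separate_jobs=True)
print(one_world)   # {'lu.A.4': 5, 'mg.A.1': 5}
print(two_worlds)  # {'lu.A.4': 4, 'mg.A.1': 1}
```

The first result matches the "running on 5 processors but was compiled for 4" warning in the quoted output; the second is what we actually need.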

Allow me to reiterate what we are looking for. We want to save
some information (related to fault tolerance, but that can be
ignored here) in another MPI application. The user will start his/her
MPI application in exactly the same way as before, plus 2 new MCA
arguments: one to enable the message logging approach and one for
the connect/accept port info. Once our internal framework is
initialized in the user application, it will connect to the spare MPI
application (let's call it the storage application), launched by the
user on some specific nodes that have better capabilities, as Aurelien
described in his initial email. Now the user application and the
storage one will be able to communicate via MPI, and therefore
get the best performance out of the available networks. Once the
user application successfully completes, the storage application can
disappear (or not, we will take what's available in Open MPI at that

This approach is not a corner case. It's a completely valid approach
as described in the MPI-2 standard. However, as usual the MPI
standard is not very clear on how to manage the connection
information, so this is the big unknown here.
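
For reference, the connect/accept flow described above could look roughly like the sketch below. It is only an illustration under several assumptions: it uses the mpi4py bindings (which must be installed), the port string is passed by hand as a plain command-line argument rather than through an MCA parameter, and the role names, helper functions and payload are all made up. Whether two independently launched jobs can actually find each other depends on the runtime, which is exactly the open question here.

```python
import sys


def parse_role(argv):
    """Return (role, port); port is only set for the 'user' role."""
    if len(argv) >= 2 and argv[1] == "storage":
        return ("storage", None)
    if len(argv) >= 3 and argv[1] == "user":
        return ("user", argv[2])
    return (None, None)


def run_storage():
    from mpi4py import MPI  # assumption: mpi4py is available
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    # Rank 0 opens the port; every rank then gets the same string.
    port = MPI.Open_port() if rank == 0 else None
    port = comm.bcast(port, root=0)
    if rank == 0:
        print("port:", port, flush=True)  # hand this to the user application
    inter = comm.Accept(port, root=0)  # blocks until the other world connects
    if rank == 0:
        payload = inter.recv(source=0, tag=0)  # source is a rank in the remote world
        print("stored:", payload, flush=True)
    inter.Disconnect()
    if rank == 0:
        MPI.Close_port(port)


def run_user(port):
    from mpi4py import MPI  # assumption: mpi4py is available
    comm = MPI.COMM_WORLD
    inter = comm.Connect(port, root=0)  # joins the two worlds via an intercommunicator
    if comm.Get_rank() == 0:
        inter.send({"note": "example payload"}, dest=0, tag=0)
    inter.Disconnect()


def main(argv):
    role, port = parse_role(argv)
    if role == "storage":
        run_storage()
    elif role == "user":
        run_user(port)
    else:
        print("usage: prog storage | prog user <port-string>")


if __name__ == "__main__":
    main(sys.argv)
```

The storage side would be started first with its own mpirun, and the port string it prints would then be given to the user application started with a second mpirun. Each side keeps its own MPI_COMM_WORLD; only the intercommunicator returned by Accept/Connect spans both.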


On Jul 27, 2007, at 11:08 AM, Ralph Castain wrote:

> Guess I was unclear, George - I don't know enough about Aurelien's app to
> know if it is capable of (or trying to) run as one job, or not.
> What has been described on this thread to-date is, in fact, a corner case.
> Hence the proposal of another way to possibly address a corner case without
> disrupting the normal code operation.
> May not be possible, per the other more general thread....
> On 7/27/07 8:31 AM, "George Bosilca" <bosilca_at_[hidden]> wrote:
>> It's not about the app. It's about the MPI standard. With one mpirun
>> you start one MPI application (SPMD or MPMD but still only one). The
>> first impact of this is that all processes started with one mpirun
>> command will belong to the same MPI_COMM_WORLD.
>> Our mpirun is in fact equivalent to the mpiexec as defined in the MPI
>> standard. Therefore, we cannot change its behavior outside the MPI-2
>> standard boundaries.
>> Moreover, both of the approaches you described will only add corner
>> cases, which I would rather limit in number.
>> george.
>> On Jul 27, 2007, at 8:42 AM, Ralph Castain wrote:
>>> On 7/26/07 4:22 PM, "Aurelien Bouteiller" <bouteill_at_[hidden]>
>>> wrote:
>>>>> mpirun -hostfile big_pool -n 10 -host 1,2,3,4 application : -n 2 -host 99,100 ft_server
>>>> This will not work: this is a way to launch MIMD jobs that share the
>>>> same COMM_WORLD, not the way to launch two different applications that
>>>> interact through Accept/Connect.
>>>> Direct consequences on simple NAS benchmarks are:
>>>> * if the second command does not use MPI_Init, then the first
>>>> application locks forever in MPI_Init
>>>> * if both use MPI_Init, the MPI_Comm_size of the jobs is incorrect.
>>>> ****
>>>> bouteill_at_dancer:~$ ompi-build/debug/bin/mpirun -prefix
>>>> /home/bouteill/ompi-build/debug/ -np 4 -host
>>>> node01,node02,node03,node04
>>>> NPB3.2-MPI/bin/lu.A.4 : -np 1 -host node01 NPB3.2-MPI/bin/mg.A.1
>>>> NAS Parallel Benchmarks 3.2 -- LU Benchmark
>>>> Warning: program is running on 5 processors
>>>> but was compiled for 4
>>>> Size: 64x 64x 64
>>>> Iterations: 250
>>>> Number of processes: 5
>>> Okay - of course, I can't possibly have any idea how your application
>>> works... ;-)
>>> However, it would be trivial to simply add two options to the app_context
>>> command line:
>>> 1. designates that this app_context is to be launched as a separate job
>>> 2. indicates that this app_context is to be "connected" a la connect/accept
>>> to the other app_contexts (if you want, we could even take an argument
>>> indicating which app_contexts it is to be connected to). Or we could reverse
>>> this and indicate we want it to be disconnected - all depends upon what
>>> default people want to define.
>>> This would solve the problem you describe while still allowing us to avoid
>>> allocation confusion. I'll send it out separately as an RFC.
>>> Thanks
>>> Ralph
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]