Open MPI User's Mailing List Archives

From: Brignone, Sergio (sbrignone_at_[hidden])
Date: 2006-03-07 17:24:40


Edgar, here is the F77 version of the source code.

Thanks

Sergio

-----Original Message-----
From: Edgar Gabriel [mailto:gabriel_at_[hidden]]
Sent: Tuesday, March 07, 2006 12:09 PM
To: Open MPI Users
Subject: Re: [OMPI users] Spawn and Disconnect

I know that there was a bug in the F90 interface of spawn-multiple,
however (which is fixed by now as far as I can tell). Could you send me
the F77 example which you have? The concatenation problem looks strange;
I would like to have a look at it...

Thanks
Edgar
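
[Editor's note] The concatenation symptom discussed in this thread is what one would see if an array of blank-padded, fixed-width Fortran strings were read as a single C string. The following is a minimal sketch of the conversion a Fortran-to-C layer has to perform; it is not Open MPI's actual code. The function names (fstr_to_cstr, farray_to_cargv, free_cargv) are hypothetical, and the sketch assumes the commands are stored contiguously as CHARACTER*(width) elements, for example "./fils1 " with width 8.

```c
#include <stdlib.h>
#include <string.h>

/* Copy one blank-padded, fixed-width Fortran string into a freshly
 * allocated NUL-terminated C string, dropping the trailing blanks.
 * Returns NULL if the allocation fails. (Hypothetical helper.) */
char *fstr_to_cstr(const char *fstr, size_t width)
{
    size_t len = width;
    while (len > 0 && fstr[len - 1] == ' ')
        len--;

    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, fstr, len);
    out[len] = '\0';
    return out;
}

/* Release an array built by farray_to_cargv. Accepts NULL. */
void free_cargv(char **cargv)
{
    if (cargv == NULL)
        return;
    for (size_t i = 0; cargv[i] != NULL; i++)
        free(cargv[i]);
    free(cargv);
}

/* Split `count` contiguous Fortran strings of `width` characters each
 * into a NULL-terminated array of C strings. Treating the same memory
 * as one C string instead would yield the concatenated name. */
char **farray_to_cargv(const char *farray, size_t count, size_t width)
{
    char **cargv = calloc(count + 1, sizeof *cargv);
    if (cargv == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++) {
        cargv[i] = fstr_to_cstr(farray + i * width, width);
        if (cargv[i] == NULL) {
            free_cargv(cargv);  /* frees the entries converted so far */
            return NULL;
        }
    }
    return cargv;  /* cargv[count] is NULL thanks to calloc */
}
```

Under these assumptions, calling farray_to_cargv on the 32-character buffer "./fils1 ./fils2 ./fils3 ./fils4 " with count 4 and width 8 gives four separate names, whereas passing the buffer through unsplit gives the single bogus name reported in the error message. Whether this is the actual cause of the failure in this thread is not established here.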

Brignone, Sergio wrote:

> Thanks Edgar, Ralph and Jean.
>
> It seems to me that the problem I am having is related to the operating
> system or MPI configuration or compiler or all of them (I am using
> Solaris).
>
> For example, the F90 as well as the C++ interfaces could not be compiled
> (I had to configure MPI without them).
>
> I converted Jean's example to F77 and tested it. It didn't work (of
> course, you can always claim that I didn't convert it right ...); in
> fact it seems I got errors in the Fortran to C conversion of strings
> (the program fils1 exists, but notice the error below: it concatenates
> all the strings, which suggests to me that the F to C conversion is not
> correct). So I am assuming that the problems are related to my
> particular environment.
>
> I will debug and see what the problem is.
>
> Thanks for your help.
>
> Sergio Brignone
>
>
>
> bash-2.03$ perem
> PR : rank = 0 size = 1
> PR : I am running on PE 0
> PR : I am before the spawning of fils1 on PE 1
>
> --------------------------------------------------------------------------
> Could not execute the executable "./fils1 ./fils2 ./fils3 ./fils4 ": No
> such file or directory
>
> This could mean that your PATH or executable name is wrong, or that you
> do not have the necessary permissions. Please ensure that the executable
> is able to be found and executed.
> --------------------------------------------------------------------------
>
>
>
> -----Original Message-----
> From: Jean Latour [mailto:latour_at_[hidden]]
> Sent: Friday, March 03, 2006 1:50 AM
> To: rhc_at_[hidden]; Open MPI Users
> Subject: Re: [OMPI users] Spawn and Disconnect
>
> Just to add an example that may help with this "disconnect" discussion:
> attached is the code of a test that does the following (and it works
> perfectly with Open MPI 1.0.1):
>
> 1) master spawns slave1
> 2) master spawns slave2
> 3) exchange messages between master and slaves over the intercommunicator
> 4) slave1 disconnects from master and finalizes
> 5) slave2 disconnects from master and finalizes
> (the processors used by slave 1 and slave 2 can now be re-used by new
> spawned processes)
> 6) master spawns slave3, and then slave4
> 7) slave3 and slave4 have NO direct communicator, but they can create
> one through the open-port mechanism (MPI_Open_port) and the
> MPI_Comm_connect / MPI_Comm_accept functions.
> The port name is relayed through the master.
> 8) slave3 and slave4 create this direct communicator and do some
> pingpong over it
> 9) slave3 and slave4 disconnect from each other on this direct
> communicator
> 10) slave3 and slave4 disconnect from master and finalize
> 11) master finalizes
>
> Hope it helps
> Best regards,
> Jean Latour
>
> Ralph Castain wrote:
>
>
>>We expect to have much better support for the entire comm_spawn
>>process in the next incarnation of the RTE. I don't expect that to be
>>included in a release, however, until 1.1 (Jeff may be able to give
>>you an estimate for when that will happen).
>>
>>Jeff et al may be able to give you access to an early non-release
>>version sooner, if better comm_spawn support is a critical issue and
>>you don't mind being patient with the inevitable bugs in such versions.
>>
>>Ralph
>>
>>
>>Edgar Gabriel wrote:
>>
>>
>>>Open MPI currently does not fully support a proper disconnection of
>>>parent and child processes. Thus, if a child dies or aborts, the parent
>>>will abort as well, despite calling MPI_Comm_disconnect. (The new RTE
>>>will have better support for these operations; Ralph/Jeff can probably
>>>give a better estimate of when this will be available.)
>>>
>>>However, what should not happen is that, if the child calls MPI_Finalize
>>>(so not a violent death but a proper shutdown), the parent goes down at
>>>the same time. Let me check that as well...
>>>
>>>Brignone, Sergio wrote:
>>>
>>>
>>>
>>>
>>>>Hi everybody,
>>>>
>>>>
>>>>
>>>>I am trying to run a master/slave set.
>>>>
>>>>Because of the nature of the problem I need to start and stop (kill)
>>>>some slaves.
>>>>
>>>>The problem is that as soon as one of the slaves dies, the master dies
>>>>also.
>>>>
>>>>
>>>>This is what I am doing:
>>>>
>>>>
>>>>
>>>>MASTER:
>>>>
>>>>
>>>>
>>>>MPI_Init(...)
>>>>
>>>>
>>>>
>>>>MPI_Comm_spawn(slave1,...,nslave1,...,intercomm1);
>>>>MPI_Barrier(intercomm1);
>>>>MPI_Comm_disconnect(&intercomm1);
>>>>MPI_Comm_spawn(slave2,...,nslave2,...,intercomm2);
>>>>MPI_Barrier(intercomm2);
>>>>MPI_Comm_disconnect(&intercomm2);
>>>>MPI_Finalize();
>>>>
>>>>SLAVE:
>>>>
>>>>MPI_Init(...)
>>>>MPI_Comm_get_parent(&intercomm);
>>>>(does something)
>>>>MPI_Barrier(intercomm);
>>>>MPI_Comm_disconnect(&intercomm);
>>>>MPI_Finalize();
>>>>
>>>>The issue is that as soon as the first set of slaves calls MPI_Finalize,
>>>>the master dies also (it dies right after
>>>>MPI_Comm_disconnect(&intercomm1)).
>>>>
>>>>What am I doing wrong?
>>>>Thanks
>>>>Sergio