Open MPI User's Mailing List Archives

From: Robert Latham (robl_at_[hidden])
Date: 2006-05-03 18:51:16


On Tue, Mar 14, 2006 at 12:37:52PM -0600, Edgar Gabriel wrote:
> I think I know what goes wrong. Since they are in different 'universes',
> they will have exactly the same 'Open MPI name', and therefore the
> algorithm in intercomm_merge cannot determine which process should be
> first and which is second.
>
> Practically, all jobs which are connected at a certain point in their
> lifetime have to be in the same MPI universe, such that all jobs will
> have different jobids and therefore different names. To use the same
> universe, you have to start the orted daemon in the persistent mode, so
> the sequence should be:
>
> orted --seed --persistent --scope public
> mpirun -np x ./app1
> mpirun -np y ./app2
>
> In this case everything should work as expected: you could do the
> comm_join between app1 and app2, and the intercomm_merge should work as well.
>
> Hope this helps
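
To illustrate the ordering problem described in the quoted message, here is a
minimal sketch in C. It is not Open MPI's actual code: the proc_name struct,
its jobid and vpid fields, and both functions are hypothetical simplifications.
It only shows why two group leaders with identical names cannot be ordered,
and why distinct jobids within one universe remove the ambiguity.

```c
#include <stdint.h>

/* Hypothetical, simplified process name (assumption, not Open MPI's type). */
typedef struct {
    uint32_t jobid;  /* assumed distinct for each job within one universe */
    uint32_t vpid;   /* assumed index of the process within its job */
} proc_name;

/* Returns -1 if a orders before b, 1 if after, 0 if indistinguishable. */
int compare_names(proc_name a, proc_name b)
{
    if (a.jobid != b.jobid) return a.jobid < b.jobid ? -1 : 1;
    if (a.vpid != b.vpid)   return a.vpid < b.vpid ? -1 : 1;
    return 0;
}

/*
 * Decide which group comes first in the merged communicator by comparing
 * the two group leaders. Returns 1 if the local group is first, 0 if the
 * remote group is first, and -1 if the names are identical and no order
 * can be chosen (the situation with two separate universes).
 */
int local_group_is_first(proc_name local_leader, proc_name remote_leader)
{
    int c = compare_names(local_leader, remote_leader);
    if (c == 0) return -1;  /* same name on both sides: ambiguous */
    return c < 0;
}
```

With two jobs started in separate universes, both leaders could carry the same
name (for example jobid 1, vpid 0 under the assumptions above), so the function
returns -1. Started under one persistent orted, the jobs get different jobids
and the comparison has a definite answer.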

This was fine on a single machine. What do you recommend for multiple
machines (e.g. app1 on node1 and app2 on node2)? How do I tell
multiple orted instances that they are part of the same universe?

thanks
==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Labs, IL USA                B29D F333 664A 4280 315B