Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] Intercomm Merge
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2013-09-18 05:22:39


On Sep 18, 2013, at 9:33 AM, George Bosilca <bosilca_at_[hidden]> wrote:

> 1. sm doesn't work between spawned processes. So you must have another network enabled.

I know :-). I have tcp available as well (OMPI will abort if you only run with sm,self because the comm_spawn will fail with unreachable errors -- I just tested/proved this to myself).

> 2. Don't use the test case attached to my email, I left an xterm based spawn and the debugging. It can't work without xterm support. Instead try using the test case from the trunk, the one committed by Ralph.

I didn't see any "xterm" strings in there, but ok. :-) I ran with orte/test/mpi/intercomm_create.c, and that hangs for me as well:

-----
❯❯❯ mpicc intercomm_create.c -o intercomm_create
❯❯❯ mpirun -np 4 intercomm_create
b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 4]
b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 5]
b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 6]
b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 7]
c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 4]
c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 5]
c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 6]
c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 7]
a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
[hang]
-----

Similarly, on my Mac, it hangs with no output:

-----
❯❯❯ mpicc intercomm_create.c -o intercomm_create
❯❯❯ mpirun -np 4 intercomm_create
[hang]
-----

> George.
>
> On Sep 18, 2013, at 07:53 , "Jeff Squyres (jsquyres)" <jsquyres_at_[hidden]> wrote:
>
>> George --
>>
>> When I build the SVN trunk (r29201) on 64 bit linux, your attached test case hangs:
>>
>> -----
>> ❯❯❯ mpicc intercomm_create.c -o intercomm_create
>> ❯❯❯ mpirun -np 4 intercomm_create
>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 4]
>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 5]
>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 6]
>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 7]
>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 4]
>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 5]
>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 6]
>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 7]
>> [hang]
>> -----
>>
>> On my Mac, it hangs without printing anything:
>>
>> -----
>> ❯❯❯ mpicc intercomm_create.c -o intercomm_create
>> ❯❯❯ mpirun -np 4 intercomm_create
>> [hang]
>> -----
>>
>>
>> On Sep 18, 2013, at 1:48 AM, George Bosilca <bosilca_at_[hidden]> wrote:
>>
>>> Here is a quick (and definitively not the cleanest) patch that addresses the MPI_Intercomm issue at the MPI level. It should be applied after removal of 29166.
>>>
>>> I also added the corrected test case stressing the corner cases by doing barriers at every inter-comm creation and doing a clean disconnect.
>>
>>
>> --
>> Jeff Squyres
>> jsquyres_at_[hidden]
>> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel


--
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/