
Subject: Re: [OMPI users] "Partial" Reduce and overlapping communicator
From: Mathieu westphal (mathieu.westphal_at_[hidden])
Date: 2012-04-06 08:11:24


Hello

Thanks for your help.

MPI_UNDEFINED led me to a better understanding of, and control over, all my
communicators. I now use only MPI_Comm_split (before, I was trying
MPI_Group_incl, which complicated things).
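
For reference, here is a minimal sketch of that approach, assuming the four
workers are ranks 0-3 of workerComm (the function and variable names are
illustrative): every worker calls MPI_Comm_split once per communicator and
passes MPI_UNDEFINED as the color when it does not belong to it, getting
MPI_COMM_NULL back.

#include <mpi.h>

/* workerComm holds the four workers W1..W4 as ranks 0..3.  Each worker
   calls MPI_Comm_split three times; ranks that do not belong to a given
   communicator pass MPI_UNDEFINED as the color and receive MPI_COMM_NULL. */
void build_overlapping_comms(MPI_Comm workerComm,
                             MPI_Comm *commA,  /* W1, W2 : ranks 0, 1 */
                             MPI_Comm *commB,  /* W3, W4 : ranks 2, 3 */
                             MPI_Comm *commC)  /* W1, W3 : ranks 0, 2 */
{
    int rank;
    MPI_Comm_rank(workerComm, &rank);

    /* commA: ranks 0 and 1; the others opt out with MPI_UNDEFINED */
    MPI_Comm_split(workerComm, rank < 2 ? 0 : MPI_UNDEFINED, rank, commA);

    /* commB: ranks 2 and 3 */
    MPI_Comm_split(workerComm, rank >= 2 ? 0 : MPI_UNDEFINED, rank, commB);

    /* commC: ranks 0 and 2 (the overlap); ranks 1 and 3 pass MPI_UNDEFINED */
    MPI_Comm_split(workerComm, rank % 2 == 0 ? 0 : MPI_UNDEFINED, rank, commC);
}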

The other errors were caused by a mistake unrelated to MPI.

It works well now, thanks.

Mathieu
On 04/05/2012 04:21 PM, George Bosilca wrote:
> Mathieu,
>
> All communicator creation functions in the MPI 2.2 standard are
> collective over the original communicator. For your specific case this
> means all processes in the worker communicator must call the
> communicator creation functions.
>
> As this is true in all cases, and as a communicator creation function
> can return only one communicator per rank, if you want to create
> overlapping communicators, the communicator creation function must be
> called by all processes in the original communicator as many times as
> there are overlapping communicators.
>
> Based on my understanding of what you did, the first MPI_Comm_split is
> correct. For creating the second communicator, either you replace the
> second call (MPI_Comm_create) with a call to MPI_Comm_split in which
> workers 2 and 4 pass color = MPI_UNDEFINED, or you force all your
> workers to call MPI_Comm_create, with workers 2 and 4 passing
> MPI_GROUP_EMPTY as the group.
>
> However, based on the description of your issues, I don't think this
> is the right solution. If you know that each worker will execute the
> same number of tasks, i.e., you need the exact same number of
> MPI_Reduce calls, you might want to look into the non-blocking
> collectives proposed in MPI 3.0. Otherwise you should implement your
> own based on non-blocking point-to-point communications.
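
For example, with MPI 3.0 each pairwise reduction could be posted as soon
as a worker's chunk is ready and completed later. A sketch (the buffer
names, MPI_SUM operation, and double datatype are just placeholders):

#include <mpi.h>

/* Post a non-blocking reduction on the pair communicator (e.g. commA)
   as soon as this worker's data is ready, overlap it with other work,
   and complete it with MPI_Wait. */
void reduce_pair(const double *localData, double *reducedData, int count,
                 MPI_Comm pairComm)
{
    MPI_Request req;

    MPI_Ireduce(localData, reducedData, count, MPI_DOUBLE, MPI_SUM,
                0 /* root within pairComm */, pairComm, &req);

    /* ... other work (e.g. receiving the next chunk) can overlap here ... */

    MPI_Wait(&req, MPI_STATUS_IGNORE);
}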
>
> george.
>
> On Apr 5, 2012, at 06:02, Mathieu Westphal wrote:
>
>> Hello
>>
>> I have a problem with my code, which runs a kind of simulator.
>>
>> I have 4 workers (i.e., 4 MPI processes) which process data.
>>
>> These data aren't available at the same time, so I have another
>> process (the Splitter) which sends chunks of data to each worker in
>> round-robin order.
>>
>> This works well using MPI_Send and MPI_Recv, but after that I need to
>> reduce all the data.
>>
>> I hoped to be able to use MPI_Reduce to reduce the data from all
>> workers, but there are two problems:
>>
>> 1. The result data aren't all available at the same time, due to the
>> delay in the arrival of the original data.
>> 2. I cannot wait for all the data to be computed; I need to perform
>> the reduction as soon as possible.
>>
>> So when the first and second workers have finished, I can reduce
>> their two results and keep the result on the first worker.
>> When the third and fourth have finished, I can reduce those two as
>> well and keep the result on the third worker.
>> Finally, I need to reduce the data from the first and third workers.
>>
>> The only way to do that using MPI_Reduce is to create communicators.
>>
>> All I want is:
>>
>> commA : contain W1 W2
>> commB : contain W3 W4
>> commC : contain W1 W3
>>
>>
>> Let's say I've already created a communicator containing only my workers:
>>
>> MPI_Comm workerComm;
>> MPI_Comm intComm[8];
>>
>> I can easily add this line in all my workers:
>>
>>
>> MPI_Comm_split(workerComm, workerId / 2, rank, &commAlpha);
>>
>> *If I understand correctly, I will get a communicator on W1 and W2
>> which contains W1 and W2, and a communicator on W3 and W4 which
>> contains W3 and W4. Am I right?*
>>
>>
>> But then, when I try to use (only on W1 and W3):
>>
>> MPI_Comm_create(workerComm, group, &commC);
>>
>> *I also need to call MPI_Comm_create on W2 and W4 or it will block. Why?*
>>
>> After that, I get a lot of intermittent errors, depending on the
>> number of workers I want to use.
>> *So is it allowed to create and use overlapping communicators? And if
>> so, how?*
>>
>> Thanks
>>
>> Mathieu