
Open MPI User's Mailing List Archives



Subject: Re: [OMPI users] "Partial" Reduce and overlapping communicator
From: Mathieu westphal (mathieu.westphal_at_[hidden])
Date: 2012-04-06 08:11:24


Thanks for your help.

MPI_UNDEFINED led me to a better understanding of, and control over, all my
communicators. I now use only MPI_Comm_split (before, I was trying
MPI_Group_incl, which complicated things).

The other errors were caused by a non-MPI-related mistake.

It works well now, thanks.
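
A minimal sketch of the color choice behind the MPI_Comm_split approach mentioned above, for readers who land on this thread. It assumes 0-based worker ids (W1 is id 0, W2 is id 1, and so on), which is consistent with the workerId/2 color in the original question. The helper names pair_color and leader_color are hypothetical, not part of MPI; the undefined_color parameter is where a caller would pass MPI_UNDEFINED, kept as a parameter so the helpers compile without <mpi.h>.

```c
#include <assert.h>

/* Color for the pairwise communicators (commA, commB):
 * workers 0 and 1 get color 0, workers 2 and 3 get color 1.
 * Assumption: worker ids are 0-based (W1 is id 0). */
int pair_color(int worker_id)
{
    assert(worker_id >= 0);
    return worker_id / 2;
}

/* Color for the communicator joining the pair leaders (commC).
 * Leaders are the first worker of each pair, i.e. the even ids.
 * Every other worker gets `undefined_color`; the caller is expected
 * to pass MPI_UNDEFINED there, so that MPI_Comm_split returns
 * MPI_COMM_NULL on those workers. */
int leader_color(int worker_id, int undefined_color)
{
    assert(worker_id >= 0);
    return (worker_id % 2 == 0) ? 0 : undefined_color;
}
```

Every worker would then make both collective calls on the worker communicator, for example MPI_Comm_split(workers_comm, pair_color(workerId), rank, &pairComm) followed by MPI_Comm_split(workers_comm, leader_color(workerId, MPI_UNDEFINED), rank, &commC). W2 and W4 receive MPI_COMM_NULL for commC and must not pass it to MPI_Reduce.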

On 04/05/2012 04:21 PM, George Bosilca wrote:
> Mathieu,
> All communicator creation functions in the MPI 2.2 standard are
> collective over the original communicator. For your specific case this
> means all processes in the worker communicator must call the
> communicator creation functions.
> As this is true in all cases, and as a communicator creation function
> can return only one communicator per rank, if you want to create
> overlapping communicators the communicator creation function should be
> called as many times as there are overlaps by all processes in the
> original communicator.
> Based on my understanding of what you did, the first MPI_Comm_split is
> correct. For creating the second communicator, either you replace the
> second call (MPI_Comm_create) by a call to MPI_Comm_split with 2 and 4
> using color=MPI_UNDEFINED, or you force all your workers to call
> MPI_Comm_create, with 2 and 4 passing MPI_GROUP_EMPTY as the group.
> However, based on the description of your issues I don't think this is
> the right solution. If you know that each worker will execute the same
> number of tasks, i.e., you need the exact same number of MPI_Reduce
> calls, you might want to look into the non-blocking collectives
> proposed in MPI 3.0. Otherwise you should implement your own reduction
> based on non-blocking point-to-point communications.
> george.
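
The last suggestion above (a hand-rolled reduction over point-to-point messages) needs a pairing schedule. The following is only a sketch of one possible schedule, a binary tree that matches the W1/W2, W3/W4, then W1/W3 order from the original question. It assumes 0-based ranks within the worker communicator; the names tree_step and enum role are hypothetical, not part of MPI.

```c
#include <assert.h>
#include <stddef.h>

enum role { ROLE_IDLE, ROLE_SEND, ROLE_RECV };

/* Role of `rank` at reduction stage `stage` (0-based) among `nworkers`.
 * With stride = 2^stage:
 *   - ranks that are multiples of 2*stride receive from rank + stride
 *     (if that partner exists);
 *   - ranks that are odd multiples of stride send to rank - stride;
 *   - all other ranks already sent at an earlier stage and are idle.
 * On return, *partner holds the peer rank, or -1 when idle. */
enum role tree_step(int rank, int nworkers, int stage, int *partner)
{
    assert(partner != NULL);
    assert(rank >= 0 && rank < nworkers);
    assert(stage >= 0 && stage < 30);

    int stride = 1 << stage;
    *partner = -1;

    if (rank % stride != 0)
        return ROLE_IDLE;               /* sent at an earlier stage */

    if (rank % (2 * stride) == 0) {
        if (rank + stride < nworkers) {
            *partner = rank + stride;
            return ROLE_RECV;
        }
        return ROLE_IDLE;               /* no partner at this stage */
    }

    *partner = rank - stride;
    return ROLE_SEND;
}
```

A worker could use this by posting MPI_Irecv from the partner when the role is ROLE_RECV, combining the two buffers once its own result is ready and MPI_Wait has completed, and calling MPI_Isend to the partner when the role is ROLE_SEND. How the buffers are combined depends on the reduction operation, which the thread does not specify.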
> On Apr 5, 2012, at 06:02 , Mathieu westphal wrote:
>> Hello
>> I have a problem with my code, which runs a kind of simulator.
>> I have 4 workers (i.e. 4 MPI processes) which process data.
>> The data isn't all available at the same time, so I have another
>> process (the Splitter) which sends chunks of data to each worker in
>> round-robin order.
>> This works well using MPI_Send and MPI_Recv, but after that I need to
>> reduce all the data.
>> I hoped to be able to use MPI_Reduce to reduce the data from all
>> workers, but there is a problem:
>> 1. The result data isn't all available at the same time, due to the
>> staggered arrival of the original data.
>> 2. I cannot wait for all the data to be computed; I need to perform
>> the reduce as soon as possible.
>> So when the first and second workers have finished, I can reduce their
>> two results and keep the result on the first worker.
>> And when the third and fourth have finished, I can reduce those
>> two as well, and keep the result on the third worker.
>> Finally, I need to reduce the data from the first and third workers.
>> The only way to do that using MPI_Reduce is to create communicators.
>> All I want is:
>> commA: contains W1 W2
>> commB: contains W3 W4
>> commC: contains W1 W3
>> Let's say I've already created a communicator only for my workers:
>> MPI_Comm workers_comm;
>> MPI_Comm intComm[8];
>> I can easily add this line in all my workers:
>> MPI_Comm_split(workers_comm, workerId / 2, rank, &CommAlpha);
>> *If I understand correctly, I will get a communicator on W1 and W2 which
>> contains W1 and W2, and a communicator on W3 and W4 which contains W3
>> and W4. Am I right?*
>> But next, when I try to use (only on W1 and W3):
>> MPI_Comm_create(workers_comm, group, &commC);
>> *I also need to call MPI_Comm_create on W2 and W4 or it will block. Why?*
>> After that, I got a lot of intermittent errors depending on the number
>> of workers I want to use.
>> *So is it allowed to create and use overlapping communicators? And
>> if so, how do I do that?*
>> Thanks
>> Mathieu
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]