Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_Allgather problem
From: teng ma (tma_at_[hidden])
Date: 2011-12-09 13:43:13


I guess your output is from different ranks. You can add the rank to the
print statement to tell them apart, as follows:

(void) printf("rank %d: gathered[%d].node = %d\n", rank, i,
gathered[i].node);

From my side, I did not see anything wrong with your code in Open MPI
1.4.3. After I add the rank, the output is:
rank 5: gathered[0].node = 0
rank 5: gathered[1].node = 1
rank 5: gathered[2].node = 2
rank 5: gathered[3].node = 3
rank 5: gathered[4].node = 4
rank 5: gathered[5].node = 5
rank 3: gathered[0].node = 0
rank 3: gathered[1].node = 1
rank 3: gathered[2].node = 2
rank 3: gathered[3].node = 3
rank 3: gathered[4].node = 4
rank 3: gathered[5].node = 5
rank 1: gathered[0].node = 0
rank 1: gathered[1].node = 1
rank 1: gathered[2].node = 2
rank 1: gathered[3].node = 3
rank 1: gathered[4].node = 4
rank 1: gathered[5].node = 5
rank 0: gathered[0].node = 0
rank 0: gathered[1].node = 1
rank 0: gathered[2].node = 2
rank 0: gathered[3].node = 3
rank 0: gathered[4].node = 4
rank 0: gathered[5].node = 5
rank 4: gathered[0].node = 0
rank 4: gathered[1].node = 1
rank 4: gathered[2].node = 2
rank 4: gathered[3].node = 3
rank 4: gathered[4].node = 4
rank 4: gathered[5].node = 5
rank 2: gathered[0].node = 0
rank 2: gathered[1].node = 1
rank 2: gathered[2].node = 2
rank 2: gathered[3].node = 3
rank 2: gathered[4].node = 4
rank 2: gathered[5].node = 5

Is that what you expected?
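
If it is, you could also check the result in code instead of reading the
printed lines by eye. Below is a minimal sketch, assuming every rank stores
its own rank in local.node as your test does, so that gathered[i].node
should equal i on every rank. The helper name first_mismatch and the struct
name test_all_gather are made up for illustration; they are not part of MPI
or of your application.

```c
#include <stdio.h>

/* Same layout as the struct in the quoted test function (name is hypothetical). */
struct test_all_gather {
    int node;
};

/*
 * Hypothetical helper, not an MPI call: returns the index of the first
 * entry whose node value differs from its position, or -1 if every
 * gathered[i].node == i.
 */
static int first_mismatch(const struct test_all_gather *gathered, int size)
{
    int i;
    for (i = 0; i < size; ++i) {
        if (gathered[i].node != i) {
            return i;
        }
    }
    return -1;
}

/*
 * Prints one rank-tagged line to stderr if the gathered data is wrong.
 * Returns 1 when the data is consistent, 0 otherwise.
 */
static int report_gather(int rank, const struct test_all_gather *gathered,
                         int size)
{
    int bad = first_mismatch(gathered, size);
    if (bad >= 0) {
        fprintf(stderr, "rank %d: gathered[%d].node = %d, expected %d\n",
                rank, bad, gathered[bad].node, bad);
        return 0;
    }
    return 1;
}
```

Calling report_gather(rank, gathered, size) right after MPI_Allgather would
then produce a single rank-tagged line only on the ranks that see wrong data,
which is easier to read than interleaved output from all ranks.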

On Fri, Dec 9, 2011 at 12:03 PM, Brett Tully <brett.tully_at_[hidden]> wrote:

> Dear all,
>
> I have not used OpenMPI much before, but am maintaining a large legacy
> application. We noticed a bug to do with a call to MPI_Allgather as
> summarised in this post to Stackoverflow:
> http://stackoverflow.com/questions/8445398/mpi-allgather-produces-inconsistent-results
>
> In the process of looking further into the problem, I noticed that the
> following function results in strange behaviour.
>
> void test_all_gather() {
>
>     struct _TEST_ALL_GATHER {
>         int node;
>     };
>
>     int ierr, size, rank;
>     ierr = MPI_Comm_size(MPI_COMM_WORLD, &size);
>     ierr = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>
>     struct _TEST_ALL_GATHER local;
>     struct _TEST_ALL_GATHER *gathered;
>
>     gathered = (struct _TEST_ALL_GATHER*) malloc(size * sizeof(*gathered));
>
>     local.node = rank;
>
>     MPI_Allgather(&local, sizeof(struct _TEST_ALL_GATHER), MPI_BYTE,
>                   gathered, sizeof(struct _TEST_ALL_GATHER), MPI_BYTE,
>                   MPI_COMM_WORLD);
>
>     int i;
>     for (i = 0; i < numnodes; ++i) {
>         (void) printf("gathered[%d].node = %d\n", i, gathered[i].node);
>     }
>
>     FREE(gathered);
> }
>
> At one point, this function printed the following:
> gathered[0].node = 2
> gathered[1].node = 3
> gathered[2].node = 2
> gathered[3].node = 3
> gathered[4].node = 4
> gathered[5].node = 5
>
> Can anyone suggest a place to start looking into why this might be
> happening? There is a section of the code that calls MPI_Comm_split, but I
> am not sure if that is related...
>
> Running on Ubuntu 11.10 and a summary of ompi_info:
> Package: Open MPI buildd_at_allspice Distribution
> Open MPI: 1.4.3
> Open MPI SVN revision: r23834
> Open MPI release date: Oct 05, 2010
> Open RTE: 1.4.3
> Open RTE SVN revision: r23834
> Open RTE release date: Oct 05, 2010
> OPAL: 1.4.3
> OPAL SVN revision: r23834
> OPAL release date: Oct 05, 2010
> Ident string: 1.4.3
> Prefix: /usr
> Configured architecture: x86_64-pc-linux-gnu
> Configure host: allspice
> Configured by: buildd
>
> Thanks!
> Brett

-- 
| Teng Ma          Univ. of Tennessee |
| tma_at_[hidden]        Knoxville, TN |
| http://web.eecs.utk.edu/~tma/       |