Yes, your output is what I was expecting. Actually, your output is what I get if I compile the code I attached in my first email. However, our application is actually doing some 'smart' stuff when you dynamically allocate memory by putting headers around the memory block -- I am guessing that this can interfere with MPI_Allgather(). What is strange is that this problem doesn't surface on the other machine that we are working with (OpenSUSE) nor does it appear if we run it with valgrind. This is probably a dumb question, but if you were to see this problem, where is the first place your gut would tell you to look?
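The header idea mentioned above can be made concrete with a small sketch. This is not the application's allocator, which the thread does not show: `block_header`, `BLOCK_MAGIC`, `wrapped_malloc`, `wrapped_size` and `wrapped_free` are hypothetical names. It assumes a header of two `size_t` fields, so that the pointer handed to the caller keeps the alignment that `malloc` provides for ordinary types on common platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sentinel value; any unlikely bit pattern would do. */
#define BLOCK_MAGIC ((size_t)0xB10CB10Cu)

/* Hypothetical bookkeeping header stored directly in front of each block. */
typedef struct {
    size_t magic; /* marks blocks that came from wrapped_malloc */
    size_t size;  /* number of user bytes that follow the header */
} block_header;

/* Allocate n user bytes preceded by a header; returns the user pointer. */
void *wrapped_malloc(size_t n) {
    if (n > SIZE_MAX - sizeof(block_header)) {
        return NULL; /* header + n would overflow */
    }
    block_header *h = malloc(sizeof *h + n);
    if (h == NULL) {
        return NULL;
    }
    h->magic = BLOCK_MAGIC;
    h->size = n;
    return h + 1; /* user memory starts right after the header */
}

/* Report the user size recorded for a block from wrapped_malloc. */
size_t wrapped_size(const void *p) {
    const block_header *h = (const block_header *)p - 1;
    assert(h->magic == BLOCK_MAGIC);
    return h->size;
}

/* Release a block from wrapped_malloc; the argument is the user pointer. */
void wrapped_free(void *p) {
    if (p == NULL) {
        return;
    }
    block_header *h = (block_header *)p - 1;
    assert(h->magic == BLOCK_MAGIC); /* only valid for wrapped blocks */
    h->magic = 0;
    free(h); /* free the address malloc originally returned */
}
```

With a layout like this, a block has to be released by the same layer that allocated it. If `FREE` in the function from the first email expands to a wrapper of this kind while `malloc` is the plain C library call (an assumption, since neither definition appears in the thread), the wrapper would read bytes in front of the allocation, which is undefined behaviour and could plausibly behave differently from one machine to another. That pairing would be one place to look first.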
I guess your output is from different ranks. You can add the rank to the print statement to tell them apart, as follows:
(void) printf("rank %d: gathered[%d].node = %d\n", rank, i, gathered[i].node);
On my side, I did not see anything wrong with your code under Open MPI 1.4.3. After I added the rank, the output is:
rank 5: gathered[0].node = 0
rank 5: gathered[1].node = 1
rank 5: gathered[2].node = 2
rank 5: gathered[3].node = 3
rank 5: gathered[4].node = 4
rank 5: gathered[5].node = 5
rank 3: gathered[0].node = 0
rank 3: gathered[1].node = 1
rank 3: gathered[2].node = 2
rank 3: gathered[3].node = 3
rank 3: gathered[4].node = 4
rank 3: gathered[5].node = 5
rank 1: gathered[0].node = 0
rank 1: gathered[1].node = 1
rank 1: gathered[2].node = 2
rank 1: gathered[3].node = 3
rank 1: gathered[4].node = 4
rank 1: gathered[5].node = 5
rank 0: gathered[0].node = 0
rank 0: gathered[1].node = 1
rank 0: gathered[2].node = 2
rank 0: gathered[3].node = 3
rank 0: gathered[4].node = 4
rank 0: gathered[5].node = 5
rank 4: gathered[0].node = 0
rank 4: gathered[1].node = 1
rank 4: gathered[2].node = 2
rank 4: gathered[3].node = 3
rank 4: gathered[4].node = 4
rank 4: gathered[5].node = 5
rank 2: gathered[0].node = 0
rank 2: gathered[1].node = 1
rank 2: gathered[2].node = 2
rank 2: gathered[3].node = 3
rank 2: gathered[4].node = 4
rank 2: gathered[5].node = 5
Is that what you expected?

On Fri, Dec 9, 2011 at 12:03 PM, Brett Tully <brett.tully@oxyntix.com> wrote:
Dear all,

I have not used OpenMPI much before, but am maintaining a large legacy application. We noticed a bug to do with a call to MPI_Allgather as summarised in this post to Stackoverflow: http://stackoverflow.com/questions/8445398/mpi-allgather-produces-inconsistent-results

In the process of looking further into the problem, I noticed that the following function results in strange behaviour.

void test_all_gather() {
    struct _TEST_ALL_GATHER {
        int node;
    };
    int ierr, size, rank;
    ierr = MPI_Comm_size(MPI_COMM_WORLD, &size);
    ierr = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    struct _TEST_ALL_GATHER local;
    struct _TEST_ALL_GATHER *gathered;
    gathered = (struct _TEST_ALL_GATHER*) malloc(size * sizeof(*gathered));
    local.node = rank;
    MPI_Allgather(&local, sizeof(struct _TEST_ALL_GATHER), MPI_BYTE,
                  gathered, sizeof(struct _TEST_ALL_GATHER), MPI_BYTE,
                  MPI_COMM_WORLD);
    int i;
    for (i = 0; i < numnodes; ++i) {
        (void) printf("gathered[%d].node = %d\n", i, gathered[i].node);
    }
    FREE(gathered);
}

At one point, this function printed the following:

gathered[0].node = 2
gathered[1].node = 3
gathered[2].node = 2
gathered[3].node = 3
gathered[4].node = 4
gathered[5].node = 5

Can anyone suggest a place to start looking into why this might be happening? There is a section of the code that calls MPI_Comm_split, but I am not sure if that is related...

Running on Ubuntu 11.10 and a summary of ompi_info:

Package: Open MPI buildd@allspice Distribution
Open MPI: 1.4.3
Open MPI SVN revision: r23834
Open MPI release date: Oct 05, 2010
Open RTE: 1.4.3
Open RTE SVN revision: r23834
Open RTE release date: Oct 05, 2010
OPAL: 1.4.3
OPAL SVN revision: r23834
OPAL release date: Oct 05, 2010
Ident string: 1.4.3
Prefix: /usr
Configured architecture: x86_64-pc-linux-gnu
Configure host: allspice
Configured by: buildd

Thanks!
Brett
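The inconsistent result reported above can also be caught in code rather than by reading the printed lines. The following is a minimal sketch, not part of the original application: `struct test_all_gather` mirrors the struct in the posted function under a non-reserved name, and `first_mismatch` is a hypothetical helper. It relies on the fact that each rank stores its own rank in `node`, so entry `i` of the gathered array should equal `i`.

```c
#include <assert.h>
#include <stddef.h>

/* Same single-int layout as the struct in the posted function. */
struct test_all_gather {
    int node;
};

/*
 * Return the index of the first entry whose node field differs from its
 * position in the array, or -1 if every entry matches.
 */
int first_mismatch(const struct test_all_gather *gathered, int size) {
    assert(gathered != NULL || size == 0);
    for (int i = 0; i < size; ++i) {
        if (gathered[i].node != i) {
            return i;
        }
    }
    return -1;
}
```

Called on every rank right after MPI_Allgather, with the `size` returned by MPI_Comm_size, a non-negative result identifies which rank saw bad data and at which index. For the six values printed above it would report index 0.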
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
--
| Teng Ma Univ. of Tennessee |
| tma@cs.utk.edu Knoxville, TN |
| http://web.eecs.utk.edu/~tma/ |