Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_Allgather problem
From: TERRY DONTJE (terry.dontje_at_[hidden])
Date: 2012-01-27 05:47:10


  ompi_info should tell you the current version of Open MPI your path is
pointing to.
Are you sure your path is pointing to the area that the OpenFOAM package
delivered Open MPI into?

--td
On 1/27/2012 5:02 AM, Brett Tully wrote:
> Interesting. In the same set of updates, I installed OpenFOAM from
> their Ubuntu deb package and it claims to ship with openmpi. I just
> downloaded their Third-party source tar and unzipped it to see what
> version of openmpi they are using, and it is 1.5.3. However, when I do
> man openmpi, or ompi_info, I get the same version as before (1.4.3).
> How do I determine for sure what is being included when I compile
> something using mpicc?
>
> Thanks,
> Brett.
>
>
> On Thu, Jan 26, 2012 at 10:05 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>
> What version did you upgrade to? (we don't control the Ubuntu
> packaging)
>
> I see a bullet in the soon-to-be-released 1.4.5 release notes:
>
> - Fix obscure cases where MPI_ALLGATHER could crash. Thanks to Andrew
> Senin for reporting the problem.
>
> But that would be surprising if this is what fixed your issue,
> especially since it's not released yet. :-)
>
>
>
> On Jan 26, 2012, at 5:24 AM, Brett Tully wrote:
>
> > As of two days ago, this problem has disappeared and the tests
> > that I had written and run each night are now passing. Having
> > looked through the update log of my machine (Ubuntu 11.10) it
> > appears as though I got a new version of mpi-default-dev
> > (0.6ubuntu1). I would like to understand this problem in more
> > detail -- is it possible to see what changed in this update?
> > Thanks,
> > Brett.
> >
> >
> >
> > On Fri, Dec 9, 2011 at 6:43 PM, teng ma <tma_at_[hidden]> wrote:
> > I guess your output is from different ranks. You can add the rank
> > to the print statement to tell them apart, as follows:
> >
> > (void) printf("rank %d: gathered[%d].node = %d\n", rank, i, gathered[i].node);
> >
> > On my side, I did not see anything wrong with your code under
> > Open MPI 1.4.3. After adding the rank, the output is:
> > rank 5: gathered[0].node = 0
> > rank 5: gathered[1].node = 1
> > rank 5: gathered[2].node = 2
> > rank 5: gathered[3].node = 3
> > rank 5: gathered[4].node = 4
> > rank 5: gathered[5].node = 5
> > rank 3: gathered[0].node = 0
> > rank 3: gathered[1].node = 1
> > rank 3: gathered[2].node = 2
> > rank 3: gathered[3].node = 3
> > rank 3: gathered[4].node = 4
> > rank 3: gathered[5].node = 5
> > rank 1: gathered[0].node = 0
> > rank 1: gathered[1].node = 1
> > rank 1: gathered[2].node = 2
> > rank 1: gathered[3].node = 3
> > rank 1: gathered[4].node = 4
> > rank 1: gathered[5].node = 5
> > rank 0: gathered[0].node = 0
> > rank 0: gathered[1].node = 1
> > rank 0: gathered[2].node = 2
> > rank 0: gathered[3].node = 3
> > rank 0: gathered[4].node = 4
> > rank 0: gathered[5].node = 5
> > rank 4: gathered[0].node = 0
> > rank 4: gathered[1].node = 1
> > rank 4: gathered[2].node = 2
> > rank 4: gathered[3].node = 3
> > rank 4: gathered[4].node = 4
> > rank 4: gathered[5].node = 5
> > rank 2: gathered[0].node = 0
> > rank 2: gathered[1].node = 1
> > rank 2: gathered[2].node = 2
> > rank 2: gathered[3].node = 3
> > rank 2: gathered[4].node = 4
> > rank 2: gathered[5].node = 5
> >
> > Is that what you expected?
> >
> > On Fri, Dec 9, 2011 at 12:03 PM, Brett Tully <brett.tully_at_[hidden]> wrote:
> > Dear all,
> >
> > I have not used OpenMPI much before, but am maintaining a large
> > legacy application. We noticed a bug to do with a call to
> > MPI_Allgather as summarised in this post to Stackoverflow:
> > http://stackoverflow.com/questions/8445398/mpi-allgather-produces-inconsistent-results
> >
> > In the process of looking further into the problem, I noticed
> > that the following function results in strange behaviour.
> >
> > #include <mpi.h>
> > #include <stdio.h>
> > #include <stdlib.h>
> >
> > void test_all_gather() {
> >
> >     struct _TEST_ALL_GATHER {
> >         int node;
> >     };
> >
> >     int ierr, size, rank;
> >     ierr = MPI_Comm_size(MPI_COMM_WORLD, &size);
> >     ierr = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> >
> >     struct _TEST_ALL_GATHER local;
> >     struct _TEST_ALL_GATHER *gathered;
> >
> >     /* one slot per rank in MPI_COMM_WORLD */
> >     gathered = (struct _TEST_ALL_GATHER*) malloc(size * sizeof(*gathered));
> >
> >     local.node = rank;
> >
> >     /* every rank contributes its struct as raw bytes */
> >     MPI_Allgather(&local, sizeof(struct _TEST_ALL_GATHER), MPI_BYTE,
> >                   gathered, sizeof(struct _TEST_ALL_GATHER), MPI_BYTE,
> >                   MPI_COMM_WORLD);
> >
> >     int i;
> >     for (i = 0; i < size; ++i) {
> >         (void) printf("gathered[%d].node = %d\n", i, gathered[i].node);
> >     }
> >
> >     free(gathered);
> > }
> >
> > At one point, this function printed the following:
> > gathered[0].node = 2
> > gathered[1].node = 3
> > gathered[2].node = 2
> > gathered[3].node = 3
> > gathered[4].node = 4
> > gathered[5].node = 5
> >
> > Can anyone suggest a place to start looking into why this might
> > be happening? There is a section of the code that calls
> > MPI_Comm_split, but I am not sure if that is related...
> >
> > Running on Ubuntu 11.10 and a summary of ompi_info:
> > Package: Open MPI buildd_at_allspice Distribution
> > Open MPI: 1.4.3
> > Open MPI SVN revision: r23834
> > Open MPI release date: Oct 05, 2010
> > Open RTE: 1.4.3
> > Open RTE SVN revision: r23834
> > Open RTE release date: Oct 05, 2010
> > OPAL: 1.4.3
> > OPAL SVN revision: r23834
> > OPAL release date: Oct 05, 2010
> > Ident string: 1.4.3
> > Prefix: /usr
> > Configured architecture: x86_64-pc-linux-gnu
> > Configure host: allspice
> > Configured by: buildd
> >
> > Thanks!
> > Brett
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> >
> >
> > --
> > | Teng Ma Univ. of Tennessee |
> > | tma_at_[hidden] Knoxville, TN |
> > | http://web.eecs.utk.edu/~tma/ |
> >
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>

-- 
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.dontje_at_[hidden]