Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] possible bug exercised by mpi4py
From: George Bosilca (bosilca_at_[hidden])
Date: 2012-05-24 09:51:00

This bug could appear in any collective that supports intercommunicators and checks the consistency of its receive buffer(s). In addition to what Jeff already fixed, I have fixed it in ALLTOALLV, ALLTOALLW and GATHER.


On May 24, 2012, at 09:37 , Jeff Squyres wrote:

> On May 24, 2012, at 9:28 AM, Jeff Squyres wrote:
>> So I checked them all, and I found SCATTERV, GATHERV, and REDUCE_SCATTER all had the issue. Now fixed on the trunk, and will be in 1.6.1.
> I forgot to mention -- this issue exists waaay back in the Open MPI code base. I spot-checked Open MPI 1.2.0 and see it there, too.
> To be clear: this particular bug only shows itself when you invoke ALLGATHERV, GATHERV, SCATTERV, or REDUCE_SCATTER on an intercommunicator where the sizes of the two groups are unequal. Whether the problem shows itself or not is rather random (i.e., it depends on how "safe" the memory is after the recvcounts array). FWIW, you can work around this bug by setting the MCA parameter "mpi_param_check" to 0, which disables all MPI function parameter checking. That may not be attractive in some cases, of course.
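
To make the failure mode concrete, here is a rough pure-Python model of the parameter check. It is not the Open MPI source. All function names are hypothetical, and it assumes the faulty check walked recvcounts using the local group size, which the thread does not spell out. On an intercommunicator the counts array has one entry per process in the remote group, so a loop bounded by the local size overruns the array when the local group is larger and checks too few entries when it is smaller.

```python
def counts_length(is_inter, local_size, remote_size):
    """Entries a recvcounts array needs: one per remote-group process on an
    intercommunicator, one per local-group process otherwise."""
    return remote_size if is_inter else local_size


def check_recvcounts(recvcounts, is_inter, local_size, remote_size):
    """Hypothetical corrected check: validate every entry of recvcounts,
    bounded by the group that actually defines its length."""
    n = counts_length(is_inter, local_size, remote_size)
    if len(recvcounts) != n:
        raise ValueError("recvcounts needs %d entries, got %d" % (n, len(recvcounts)))
    for i in range(n):
        if recvcounts[i] < 0:
            raise ValueError("negative count at index %d" % i)
    return n


def buggy_check(recvcounts, local_size):
    """Assumed shape of the faulty check: always bounded by the local size.
    Python raises IndexError on the overrun; C would silently read whatever
    memory follows the array, hence the 'random' behaviour."""
    for i in range(local_size):
        if recvcounts[i] < 0:
            raise ValueError("negative count at index %d" % i)
    return local_size


if __name__ == "__main__":
    # Local group of 4, remote group of 2: recvcounts has only 2 entries.
    counts = [1, 1]
    print(check_recvcounts(counts, True, 4, 2))
    try:
        buggy_check(counts, 4)
    except IndexError:
        print("buggy check read past the end of recvcounts")
```

The workaround Jeff mentions can be applied per run with `mpirun --mca mpi_param_check 0 ...`, or by exporting `OMPI_MCA_mpi_param_check=0` in the environment before launching.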
> More specifically: since this problem has been in the OMPI code base for *years* (possibly since 1.0 -- but I'm not going to bother to check), it shows how little real-world applications actually use this specific functionality. Don't get me wrong -- I'm *very* thankful to the mpi4py community for raising this issue, and I'm glad to get it fixed! But it does show that there are dark, dusty corners in MPI functionality where few bother to tread. :-)
> --
> Jeff Squyres
> jsquyres_at_[hidden]