On Nov 23, 2013, at 01:18 , Pierre Jolivet <jolivet@ann.jussieu.fr> wrote:

George,

On Nov 22, 2013, at 5:21 AM, George Bosilca <bosilca@icl.utk.edu> wrote:

Pierre,

On Nov 22, 2013, at 02:39 , Pierre Jolivet <jolivet@ann.jussieu.fr> wrote:

George,
I completely agree that the code I sent was a good example of what NOT to do with collective and non-blocking communications, so I'm sending a better one.
1. I'm setting MPI_DATATYPE_NULL only on non-root processes. The root has a real datatype. Why should both match when using MPI_IN_PLACE?

Because it is a strong requirement of the MPI standard: the typemap of a send must be matched by its corresponding receive. Otherwise, the implementation is allowed to raise an error of class MPI_ERR_TYPE.
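
As an illustration (this is not code from the thread; the buffer sizes and variable names are made up), a calling pattern that keeps the send and receive typemaps matched while still using MPI_IN_PLACE at the root might look like this:

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char** argv) {
  int taskid, ntasks, i;
  double recv = 0.0, *sendbuf = NULL;
  MPI_Request rq;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &taskid);
  MPI_Comm_size(MPI_COMM_WORLD, &ntasks);

  if (taskid == 0) {
    sendbuf = malloc(ntasks * sizeof(double)); /* one double per rank */
    for (i = 0; i < ntasks; ++i)
      sendbuf[i] = (double)i;
    /* root: recvbuf/recvcount/recvtype are ignored with MPI_IN_PLACE,
       but the send arguments use a real datatype and a matching count */
    MPI_Iscatter(sendbuf, 1, MPI_DOUBLE, MPI_IN_PLACE, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD, &rq);
  } else {
    /* non-root: the send arguments are not significant, but keeping a real
       MPI_DOUBLE there means the typemaps always match */
    MPI_Iscatter(NULL, 0, MPI_DOUBLE, &recv, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD, &rq);
  }
  MPI_Wait(&rq, MPI_STATUS_IGNORE);
  free(sendbuf);
  MPI_Finalize();
}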

2-3-4. Yes, all these points are valid; this is of course just a minimalist example.

My question is: if you are indeed saying that it is not an Open MPI bug, what is the rationale for the behavior differing between MPI_Scatter and MPI_Iscatter when it comes to the send type on non-root processes?

Different algorithms implemented by different people. Some of them are more robust than others. In this case, Scatter translates count = 0 into a message length of 0, while Iscatter always looks up the extent of the datatype.
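
If that is the case, the failure presumably boils down to something equivalent to asking for the extent of MPI_DATATYPE_NULL. Here is a minimal stand-alone probe of that guess (not code from the thread; it switches MPI_COMM_WORLD to MPI_ERRORS_RETURN so the failed query shows up as a return code rather than an abort, and the assumption that the query fails with MPI_ERR_TYPE is mine, not something stated above):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
  MPI_Aint lb, extent;
  int err, errclass;

  MPI_Init(&argc, &argv);
  /* return error codes instead of aborting, so the failed query is visible */
  MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

  /* presumably the nonblocking path does the equivalent of this lookup
     on the send datatype even though the count is 0 */
  err = MPI_Type_get_extent(MPI_DATATYPE_NULL, &lb, &extent);
  if (err != MPI_SUCCESS) {
    MPI_Error_class(err, &errclass);
    printf("extent of MPI_DATATYPE_NULL: error class %d (MPI_ERR_TYPE = %d)\n", errclass, MPI_ERR_TYPE);
  }

  MPI_Finalize();
}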

I don't see any remark on that matter in the MPI 3.0 documentation.

Indeed, and there is at least one place where MPI_DATATYPE_NULL is explicitly used for calls where the datatype does not matter (4.23, for example). Horrible!!!

That's what I don't get: why are you saying it's horrible? The spec clearly states that the datatype is only significant at the root (for Scatter), and that Iscatter is nothing more than a nonblocking variant of Scatter (so the value should likewise be significant only at the root).

Because this is mixing two concepts: non-existence and non-significance. 

Moreover, there is at least one thing that is wrong in the sources:
1) in ompi/mca/coll/libnbc/nbc_igather.c, line 55 should read:
  if (MPI_SUCCESS != res) { printf("MPI Error in MPI_Comm_size() (%i)\n", res); return res; }
instead of:
  if (MPI_SUCCESS != res) { printf("MPI Error in MPI_Comm_rank() (%i)\n", res); return res; }

And I still have a hard time believing that the test at line 56 of ompi/mca/coll/libnbc/nbc_igather.c, namely if (rank == root), is not missing at line 58 of ompi/mca/coll/libnbc/nbc_iscatter.c, but I guess I will have to trust you on this one.

A patch has been submitted (r29736). Thanks for the bug report.

  George.



You should probably document somewhere that you differ from the standard for that function; other MPI implementations don't have this limitation, cf. http://pic.dhe.ibm.com/infocenter/zos/v1r12/index.jsp?topic=%2Fcom.ibm.zos.r12.fomp200%2Fipezps0025.htm or http://trac.mpich.org/projects/mpich/browser/src/mpi/coll/iscatter.c#L601

Pierre

 George.


Thanks.

#include <mpi.h>

int main(int argc, char** argv) {
  int          taskid, ntasks;
  MPI_Init(&argc, &argv);
  MPI_Request rq;

  MPI_Comm_rank(MPI_COMM_WORLD,&taskid);
  MPI_Comm_size(MPI_COMM_WORLD,&ntasks);
  double* r;
  int l = 0;
  // Case 1: Iscatter with MPI_DOUBLE as the send type on non-root processes -- runs fine.
  if(taskid > 0)
      MPI_Iscatter(NULL, 0, MPI_DOUBLE, r, l, MPI_DOUBLE, 0, MPI_COMM_WORLD, &rq);
  else
      MPI_Iscatter(r, l, MPI_DOUBLE, MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, 0, MPI_COMM_WORLD, &rq);
  MPI_Wait(&rq, MPI_STATUS_IGNORE);
  // Case 2: blocking Scatter with MPI_DATATYPE_NULL as the send type on non-root processes -- runs fine.
  if(taskid > 0)
      MPI_Scatter(NULL, 0, MPI_DATATYPE_NULL, r, l, MPI_DOUBLE, 0, MPI_COMM_WORLD);
  else
      MPI_Scatter(r, l, MPI_DOUBLE, MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, 0, MPI_COMM_WORLD);
  // Case 3: Iscatter with MPI_DATATYPE_NULL as the send type on non-root processes -- does not run fine.
  if(taskid > 0)
      MPI_Iscatter(NULL, 0, MPI_DATATYPE_NULL, r, l, MPI_DOUBLE, 0, MPI_COMM_WORLD, &rq);
  else
      MPI_Iscatter(r, l, MPI_DOUBLE, MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, 0, MPI_COMM_WORLD, &rq);
  MPI_Wait(&rq, MPI_STATUS_IGNORE);
  MPI_Finalize();
}

On Nov 21, 2013, at 4:34 PM, George Bosilca <bosilca@icl.utk.edu> wrote:

Pierre,
There are several issues with the code you provided.

1. You can't use MPI_DATATYPE_NULL as the send datatype, not even when the count is zero. At least the root must provide a real datatype. In fact, the type signature of the send message (datatype and count) should match the type signature of the receive.

2. I know your count is zero and no data will be transmitted, but your code is difficult to read and understand.

3. MPI_Iscatter is a collective communication. As such, all processes in the associated communicator (MPI_COMM_WORLD in your case) must participate in the collective. Thus, calling MPI_Iscatter only where taskid > 0 is incorrect (you explicitly excluded rank 0).

4. From the MPI standard's perspective, your example is not correct, as you are not allowed to call MPI_Finalize while there are requests pending. Open MPI tolerates this, but it is clearly not standard behavior.

#include <mpi.h>

int main(int argc, char** argv)
{
 int          taskid, ntasks;
 MPI_Init(&argc, &argv);
 MPI_Request rq;
 MPI_Comm_rank(MPI_COMM_WORLD,&taskid);
 MPI_Comm_size(MPI_COMM_WORLD,&ntasks);
 double r;
 int l = 0;

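 /* every rank calls the collective, with a real MPI_DOUBLE send datatype */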
 MPI_Iscatter(NULL, 0, MPI_DOUBLE, &r, l, MPI_DOUBLE, 0, MPI_COMM_WORLD, &rq);
 MPI_Wait(&rq, MPI_STATUS_IGNORE);

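 /* the request is completed before MPI_Finalize, as required by the standard */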
 MPI_Finalize();
}

George.


On Nov 21, 2013, at 23:19 , Pierre Jolivet <jolivet@ann.jussieu.fr> wrote:

Hello,
The following code doesn't execute properly:
#include <mpi.h>

int main(int argc, char** argv) {
  int          taskid, ntasks;
  MPI_Init(&argc, &argv);
  MPI_Request rq;

  MPI_Comm_rank(MPI_COMM_WORLD,&taskid);
  MPI_Comm_size(MPI_COMM_WORLD,&ntasks);
  double* r;
  int l = 0;
  if(taskid > 0)
      MPI_Iscatter(NULL, 0, MPI_DATATYPE_NULL, r, l, MPI_DOUBLE, 0, MPI_COMM_WORLD, &rq);
  MPI_Finalize();
}

Outcome:
*** An error occurred in MPI_Type_extent
*** MPI_ERR_TYPE: invalid datatype

Hotfix: change MPI_DATATYPE_NULL to something else.

Thanks for a quick fix.
_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users