George,

On Nov 22, 2013, at 5:21 AM, George Bosilca <bosilca@icl.utk.edu> wrote:

Pierre,

On Nov 22, 2013, at 02:39 , Pierre Jolivet <jolivet@ann.jussieu.fr> wrote:

George,
I completely agree that the code I sent was a good example of what NOT to do with collective and non-blocking communications, so I'm sending a better one.
1. I'm setting MPI_DATATYPE_NULL only on non-root processes. The root has a real datatype. Why should both match when using MPI_IN_PLACE?

Because it is a strong requirement of the MPI standard: the typemap of a send must match that of its corresponding receive. Otherwise, it is legal to raise an exception of type MPI_ERR_TYPE.
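For example, this is the kind of matching the standard has in mind with a real scatter of one double per rank (a minimal sketch; sendbuf/recv are illustrative names and ntasks comes from MPI_Comm_size as in your code):

  double sendbuf[ntasks], recv;   /* sendbuf is only read at the root */
  /* send signature at the root (1 x MPI_DOUBLE per rank) matches the
     receive signature on every rank (1 x MPI_DOUBLE) */
  MPI_Scatter(sendbuf, 1, MPI_DOUBLE, &recv, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);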

2-3-4. Yes, all these points are valid, this is of course just a minimalist example.

My question is, if you are indeed saying that it is not an Open MPI bug, what is the rationale for changing the behavior between MPI_Scatter and MPI_Iscatter when it comes down to the send type on non-root processes?

Different algorithms implemented by different people. Some of them are more robust than others. In this case, Scatter translates count = 0 into a message length of 0, while Iscatter always looks up the extent of the datatype.
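Roughly speaking, the non-blocking path does something morally equivalent to the following on every rank, which is why the failure shows up in MPI_Type_extent (a sketch of the idea, not the actual libnbc code):

  MPI_Aint lb, extent;
  /* raises MPI_ERR_TYPE when sendtype == MPI_DATATYPE_NULL,
     even though sendcount is 0 and nothing would be transmitted */
  MPI_Type_get_extent(sendtype, &lb, &extent);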

I don't see any remark on that matter in the MPI 3.0 documentation.

Indeed, and there is at least one example where MPI_DATATYPE_NULL is explicitly used for calls where the datatype does not matter (4.23 as an example). Horrible!!!

That's what I don't get: why are you saying it's horrible? It is clearly written in the spec that the datatype is only significant at the root (for Scatter), and that Iscatter is nothing more than a nonblocking variant of Scatter (so the value should likewise be significant only at the root).

Moreover, there is at least one thing that is wrong in the sources:
1) in ompi/mca/coll/libnbc/nbc_igather.c, line 55 should read:
  if (MPI_SUCCESS != res) { printf("MPI Error in MPI_Comm_size() (%i)\n", res); return res; }
instead of:
  if (MPI_SUCCESS != res) { printf("MPI Error in MPI_Comm_rank() (%i)\n", res); return res; }

And I still have a hard time believing that the test at line 56 of ompi/mca/coll/libnbc/nbc_igather.c, if (rank == root), is not missing at line 58 of ompi/mca/coll/libnbc/nbc_iscatter.c, but I guess I will have to trust you on this one.
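Something along these lines is what I would have expected around the send-side extent query (just a sketch of the idea, reusing rank, root, sendtype and res from that function, not an actual patch):

  MPI_Aint sndext = 0;
  if (rank == root) {
    /* only the root has a meaningful send type, so only the root queries its extent */
    res = MPI_Type_extent(sendtype, &sndext);
    if (MPI_SUCCESS != res) { printf("MPI Error in MPI_Type_extent() (%i)\n", res); return res; }
  }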

You should probably document somewhere that you differ from the standard for that function; other MPI implementations don't have this limitation, cf. http://pic.dhe.ibm.com/infocenter/zos/v1r12/index.jsp?topic=%2Fcom.ibm.zos.r12.fomp200%2Fipezps0025.htm or http://trac.mpich.org/projects/mpich/browser/src/mpi/coll/iscatter.c#L601

Pierre

 George.


Thanks.

#include <mpi.h>

int main(int argc, char** argv) {
  int          taskid, ntasks;
  MPI_Init(&argc, &argv);
  MPI_Request rq;

  MPI_Comm_rank(MPI_COMM_WORLD,&taskid);
  MPI_Comm_size(MPI_COMM_WORLD,&ntasks);
  double* r;   // receive buffer, never dereferenced since every count below is 0
  int l = 0;
  // Case 1: runs fine, non-root processes pass MPI_DOUBLE as the send type
  if(taskid > 0)
      MPI_Iscatter(NULL, 0, MPI_DOUBLE, r, l, MPI_DOUBLE, 0, MPI_COMM_WORLD, &rq);
  else
      MPI_Iscatter(r, l, MPI_DOUBLE, MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, 0, MPI_COMM_WORLD, &rq);
  MPI_Wait(&rq, MPI_STATUS_IGNORE);
  // Case 2: runs fine, non-root processes pass MPI_DATATYPE_NULL to the blocking MPI_Scatter
  if(taskid > 0)
      MPI_Scatter(NULL, 0, MPI_DATATYPE_NULL, r, l, MPI_DOUBLE, 0, MPI_COMM_WORLD);
  else
      MPI_Scatter(r, l, MPI_DOUBLE, MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, 0, MPI_COMM_WORLD);
  // Case 3: fails, non-root processes pass MPI_DATATYPE_NULL to MPI_Iscatter
  if(taskid > 0)
      MPI_Iscatter(NULL, 0, MPI_DATATYPE_NULL, r, l, MPI_DOUBLE, 0, MPI_COMM_WORLD, &rq);
  else
      MPI_Iscatter(r, l, MPI_DOUBLE, MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, 0, MPI_COMM_WORLD, &rq);
  MPI_Wait(&rq, MPI_STATUS_IGNORE);
  MPI_Finalize();
}

On Nov 21, 2013, at 4:34 PM, George Bosilca <bosilca@icl.utk.edu> wrote:

Pierre,
There are several issues with the code you provided.

1. You can't use MPI_DATATYPE_NULL as the send datatype, not even when the count is zero. At least the root must provide a real datatype. In fact, the type signature of the send message (count and datatype) should match the type signature of the receive.

2. I know your count is zero and no data will be transmitted, but your code is difficult to read and understand.

3. MPI_Iscatter is a collective communication. As such, all processes in the associated communicator (MPI_COMM_WORLD in your case) must participate in the collective. Thus, calling MPI_Iscatter only where taskid > 0 is incorrect (you explicitly exclude rank 0).

4. From the MPI standard's perspective your example is not correct, as you are not allowed to call MPI_Finalize while there are pending requests. Open MPI tolerates this, but it is clearly not standard behavior.

#include <mpi.h>

int main(int argc, char** argv)
{
 int          taskid, ntasks;
 MPI_Init(&argc, &argv);
 MPI_Request rq;
 MPI_Comm_rank(MPI_COMM_WORLD,&taskid);
 MPI_Comm_size(MPI_COMM_WORLD,&ntasks);
 double r;
 int l = 0;

 MPI_Iscatter(NULL, 0, MPI_DOUBLE, &r, l, MPI_DOUBLE, 0, MPI_COMM_WORLD, &rq);
 MPI_Wait(&rq, MPI_STATUS_IGNORE);

 MPI_Finalize();
}

George.


On Nov 21, 2013, at 23:19 , Pierre Jolivet <jolivet@ann.jussieu.fr> wrote:

Hello,
The following code doesn't execute properly:
#include <mpi.h>

int main(int argc, char** argv) {
  int          taskid, ntasks;
  MPI_Init(&argc, &argv);
  MPI_Request rq;

  MPI_Comm_rank(MPI_COMM_WORLD,&taskid);
  MPI_Comm_size(MPI_COMM_WORLD,&ntasks);
  double* r;
  int l = 0;
  if(taskid > 0)
      MPI_Iscatter(NULL, 0, MPI_DATATYPE_NULL, r, l, MPI_DOUBLE, 0, MPI_COMM_WORLD, &rq);
  MPI_Finalize();
}

Outcome:
*** An error occurred in MPI_Type_extent
*** MPI_ERR_TYPE: invalid datatype

Hotfix: change MPI_DATATYPE_NULL to something else.

Thanks for a quick fix.
_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users