Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] MPI datatype problem in mpi_test_suite?
From: Rainer Keller (keller_at_[hidden])
Date: 2009-07-06 19:43:48


Jeff,
On Monday 06 July 2009 11:05:16 am Jeff Squyres wrote:
> I notice that in the new HLRS mpi_test_suite, I'm getting oodles of
Well, the test suite is not really new (it was started some time around 2003).
Regular ompi testing is what is new ;-)) Thanks for that, Jeff!

> errors with the MPI_TYPE_MIX and MPI_SHORT_INT datatypes (Linux/
> x86_64). I have to run with:
>
> mpirun mpi_test_suite -d All,\!MPI_TYPE_MIX,\!MPI_SHORT_INT
>
> (which excludes these two types)
>
> I can't quite follow the test suite code, but MPI_TYPE_MIX is some
> kind of derived MPI datatype.
Yes. Basically MPI_TYPE_MIX (and MPI_TYPE_MIX_LB_UB) is a struct of 11 basic
types:
MPI_Datatype mix_type[11] = {MPI_CHAR, MPI_SHORT, MPI_INT, MPI_LONG,
                       MPI_FLOAT, MPI_DOUBLE, MPI_FLOAT_INT,
                       MPI_DOUBLE_INT, MPI_LONG_INT, MPI_SHORT_INT, MPI_2INT};

Now, as it contains MPI_SHORT_INT (which contains a hole), the failures with
both types may well have a similar cause!
This has to be investigated.
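
To make the "hole" concrete, here is a minimal sketch in plain C (no MPI calls). It assumes that the (short, int) pair is laid out like the C struct below; the struct and the function names are hypothetical and are not taken from the test suite or from Open MPI. The idea is to preset a raw buffer with a pattern, write only the two members at their offsets, and count how many bytes of the gap still hold the pattern.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical C-side mirror of a (short, int) pair type. */
struct short_int {
    short val;
    int   loc;
};

/* Padding bytes between the end of 'val' and the start of 'loc'. */
size_t short_int_hole_bytes(void)
{
    return offsetof(struct short_int, loc) - sizeof(short);
}

/*
 * Preset a raw buffer with 'pattern', copy only the two members to their
 * offsets (roughly what unpacking a received element has to do), and
 * return how many bytes of the hole still hold the pattern.
 */
size_t hole_intact_after_unpack(short val, int loc, unsigned char pattern)
{
    unsigned char raw[sizeof(struct short_int)];
    size_t intact = 0;
    size_t i;

    memset(raw, pattern, sizeof raw);
    memcpy(raw + offsetof(struct short_int, val), &val, sizeof val);
    memcpy(raw + offsetof(struct short_int, loc), &loc, sizeof loc);

    for (i = sizeof(short); i < offsetof(struct short_int, loc); i++) {
        if (raw[i] == pattern)
            intact++;
    }
    return intact;
}
```

If the unpacking touches only the members, hole_intact_after_unpack() returns the same value as short_int_hole_bytes(); a smaller value would mean the hole was overwritten.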

> Is something wrong with our datatype engine? Or are these tests
> faulty?
First of all, the MPI standard requires types such as MPI_FLOAT_INT or
MPI_SHORT_INT to be usable in reduction operations.
Nevertheless, they should be fine here.

Now, MPIch2-1.1 works fine with all the datatypes (including MPI_TYPE_MIX)
------------------------------
mpirun -np 2 ./mpi_test_suite -t 'P2P,Collective' -r FULL -x strict
P2P tests Ring (3/44), comm MPI_COMM_WORLD (1/13), type MPI_CHAR (1/29)
...
Collective tests Alltoall (47/44), comm Intracomm merged of the Halved
Intercomm (13/13), type MPI_TYPE_MIX_LB_UB (29/29)
Number of failed tests:0
------------------------------

> I don't know if anyone has run this test suite with any regularity before,
> so I don't know which it is...
Tests with these datatypes have been run on IBM's MPI, NEC's MPI (derived from
MPIch) and Intel MPI (well, also MPIch based) although these were tested some
time ago.
Tests against MPIch-1 and now MPIch2 have been done very often and bugs have
been tracked down, so I believe the core of the test suite itself is fine!

[I am not talking about correctness of individual tests themselves, e.g. -t
one-sided will definitely show bugs in the test-suite]...

With best regards,
Rainer

PS: The test suite fills the send buffers with known values according to the
datatype being passed to the test and afterwards checks against expected
values.
The send and recv buffers are preset with a definable pattern (0xa5) to check
for overwritten data in holes (see type MPI_TYPE_MIX_LB_UB).
The buffer starts with the MIN, then the MAX value of the given datatype,
followed by (2+rank_of_comm_partner), (3+rank...) etc.
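
As a rough illustration of the fill-and-check scheme just described, here is a minimal sketch in plain C for the int case only. The function names are hypothetical and this is not the test suite's actual code; it only assumes that element i (for i >= 2) holds i plus the rank of the communication partner.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Fill buf with MIN, MAX, then (2 + partner_rank), (3 + partner_rank), ... */
void fill_expected_int(int *buf, size_t n, int partner_rank)
{
    size_t i;

    assert(n >= 2);   /* need room for the MIN and MAX entries */
    buf[0] = INT_MIN;
    buf[1] = INT_MAX;
    for (i = 2; i < n; i++)
        buf[i] = (int)i + partner_rank;
}

/* Count elements of recv that differ from the expected sequence. */
size_t count_mismatches_int(const int *recv, size_t n, int partner_rank)
{
    size_t i;
    size_t bad = 0;

    assert(n >= 2);
    if (recv[0] != INT_MIN)
        bad++;
    if (recv[1] != INT_MAX)
        bad++;
    for (i = 2; i < n; i++) {
        if (recv[i] != (int)i + partner_rank)
            bad++;
    }
    return bad;
}
```

A receiver would call count_mismatches_int() on the buffer it got from its partner; any non-zero result points at a corrupted element.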

One may check the hex values of all communicated buffers using a higher
report level (-r FULL); however, one may want to reduce the number of elements
sent using -n, e.g. -n 10.
Higher values (default is -n 1000), however, have exposed problems (which
hinted at bugs) when switching from eager protocol... These have been fixed in
ompi.

-- 
------------------------------------------------------------------------
Rainer Keller, PhD                  Tel: +1 (865) 241-6293
Oak Ridge National Lab          Fax: +1 (865) 241-4811
PO Box 2008 MS 6164           Email: keller_at_[hidden]
Oak Ridge, TN 37831-2008    AIM/Skype: rusraink