Subject: [MTT users] mtt IBM reduce_scatter_in_place test failure
From: Lenny Verkhovsky (lenny.verkhovsky_at_[hidden])
Date: 2008-09-17 01:51:56


I am running the MTT tests on our cluster and found a failure in the
IBM reduce_scatter_in_place test for np > 8:

/home/USERS/lenny/OMPI_1_3_TRUNK/bin/mpirun -np 10 -H witch2
./reduce_scatter_in_place

[**WARNING**]: MPI_COMM_WORLD rank 4, file reduce_scatter_in_place.c:80:
bad answer (0) at index 0 of 1000 (should be 40000)
[**WARNING**]: MPI_COMM_WORLD rank 3, file reduce_scatter_in_place.c:80:
[**WARNING**]: MPI_COMM_WORLD rank 2, file reduce_scatter_in_place.c:80:
bad answer (20916) at index 0 of 1000 (should be 20000)
bad answer (0) at index 0 of 1000 (should be 30000)
[**WARNING**]: MPI_COMM_WORLD rank 5, file reduce_scatter_in_place.c:80:
bad answer (0) at index 0 of 1000 (should be 50000)
[**WARNING**]: MPI_COMM_WORLD rank 6, file reduce_scatter_in_place.c:80:
bad answer (0) at index 0 of 1000 (should be 60000)
[**WARNING**]: MPI_COMM_WORLD rank 7, file reduce_scatter_in_place.c:80:
[**WARNING**]: MPI_COMM_WORLD rank 8, file reduce_scatter_in_place.c:80:
bad answer (0) at index 0 of 1000 (should be 80000)
bad answer (0) at index 0 of 1000 (should be 70000)
[**WARNING**]: MPI_COMM_WORLD rank 9, file reduce_scatter_in_place.c:80:
bad answer (0) at index 0 of 1000 (should be 90000)
[**WARNING**]: MPI_COMM_WORLD rank 0, file reduce_scatter_in_place.c:80:
bad answer (-516024720) at index 0 of 1000 (should be 0)
[**WARNING**]: MPI_COMM_WORLD rank 1, file reduce_scatter_in_place.c:80:
bad answer (28112) at index 0 of 1000 (should be 10000)

I think that the error is in the test itself.

--- sources/test_get__ibm/ibm/collective/reduce_scatter_in_place.c 2005-09-28 18:11:37.000000000 +0300
+++ installs/LKcC/tests/ibm/ibm/collective/reduce_scatter_in_place.c 2008-09-16 19:32:48.000000000 +0300
@@ -64,7 +64,7 @@ int main(int argc, char **argv)
  ompitest_error(__FILE__, __LINE__, "Doh! Rank %d was not able to allocate enough memory. MPI test aborted!\n", myself);
  }

- for (j = 1; j <= MAXLEN; j *= 10) {
+ for (j = 1; j < MAXLEN; j *= 10) {
  for (i = 0; i < tasks; i++) {
  recvcounts[i] = j;
  }

I am not sure whether this is the right fix, or who can review and
commit it to the test trunk.

Best regards

Lenny.