Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] mtt IBM reduce_scatter_in_place test failure
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-09-17 08:58:10


At first glance, the test looks ok.

Why do you think <= is incorrect? Is there a buffer length problem
somewhere?

I can reproduce the problem with 10 procs, though, while the test runs
successfully with 8. I get the same results with both the openib and
tcp BTLs.

Can you file a ticket / dig a little deeper to see what's going wrong?

On Sep 16, 2008, at 1:00 PM, Lenny Verkhovsky wrote:

> I am running the MTT tests on our cluster and found an error in the
> IBM reduce_scatter_in_place test for np > 8.
>
> /home/USERS/lenny/OMPI_1_3_TRUNK/bin/mpirun -np 10 -H witch2 ./reduce_scatter_in_place
>
> [**WARNING**]: MPI_COMM_WORLD rank 4, file reduce_scatter_in_place.c:80:
> bad answer (0) at index 0 of 1000 (should be 40000)
> [**WARNING**]: MPI_COMM_WORLD rank 3, file reduce_scatter_in_place.c:80:
> [**WARNING**]: MPI_COMM_WORLD rank 2, file reduce_scatter_in_place.c:80:
> bad answer (20916) at index 0 of 1000 (should be 20000)
> bad answer (0) at index 0 of 1000 (should be 30000)
> [**WARNING**]: MPI_COMM_WORLD rank 5, file reduce_scatter_in_place.c:80:
> bad answer (0) at index 0 of 1000 (should be 50000)
> [**WARNING**]: MPI_COMM_WORLD rank 6, file reduce_scatter_in_place.c:80:
> bad answer (0) at index 0 of 1000 (should be 60000)
> [**WARNING**]: MPI_COMM_WORLD rank 7, file reduce_scatter_in_place.c:80:
> [**WARNING**]: MPI_COMM_WORLD rank 8, file reduce_scatter_in_place.c:80:
> bad answer (0) at index 0 of 1000 (should be 80000)
> bad answer (0) at index 0 of 1000 (should be 70000)
> [**WARNING**]: MPI_COMM_WORLD rank 9, file reduce_scatter_in_place.c:80:
> bad answer (0) at index 0 of 1000 (should be 90000)
> [**WARNING**]: MPI_COMM_WORLD rank 0, file reduce_scatter_in_place.c:80:
> bad answer (-516024720) at index 0 of 1000 (should be 0)
> [**WARNING**]: MPI_COMM_WORLD rank 1, file reduce_scatter_in_place.c:80:
> bad answer (28112) at index 0 of 1000 (should be 10000)
>
> I think that the error is in the test itself.
>
> --- sources/test_get__ibm/ibm/collective/reduce_scatter_in_place.c 2005-09-28 18:11:37.000000000 +0300
> +++ installs/LKcC/tests/ibm/ibm/collective/reduce_scatter_in_place.c 2008-09-16 19:32:48.000000000 +0300
> @@ -64,7 +64,7 @@ int main(int argc, char **argv)
>          ompitest_error(__FILE__, __LINE__, "Doh! Rank %d was not able to allocate enough memory. MPI test aborted!\n", myself);
>      }
>
> -    for (j = 1; j <= MAXLEN; j *= 10) {
> +    for (j = 1; j < MAXLEN; j *= 10) {
>          for (i = 0; i < tasks; i++) {
>              recvcounts[i] = j;
>          }
>
>
> I am not sure whether this is the right fix, or who can review and
> commit it to the test trunk.
>
>
> Best regards
>
> Lenny.

-- 
Jeff Squyres
Cisco Systems