
Open MPI Development Mailing List Archives


Subject: [OMPI devel] Possible bug: rdma OSC does not progress RMA operations
From: Iliev, Hristo (Iliev_at_[hidden])
Date: 2013-09-12 13:01:53


Hi,

 

It looks like the rdma OSC component does not progress passive RMA
operations at the target during calls to MPI_WIN_(UN)LOCK. As a sample case,
take a master-worker program where each worker writes to an entry in an
array exposed in the master's window:

 

MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);

if (rank == 0)
{
   // Master code
   MPI_Alloc_mem(size * sizeof(int), MPI_INFO_NULL, &array);
   memset(array, 0, size * sizeof(int));
   MPI_Win_create(array, size * sizeof(int), sizeof(int), MPI_INFO_NULL,
                  MPI_COMM_WORLD, &win);
   do
   {
      MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 0, 0, win);
      // Count the non-zero elements of array
      nonzeros = 0;
      for (i = 0; i < size; i++)
         if (array[i] != 0)
            nonzeros++;
      MPI_Win_unlock(0, win);
   } while (nonzeros < size-1);
   MPI_Win_free(&win);
   MPI_Free_mem(array);
}
else
{
   // Worker code
   int one = 1;
   MPI_Win_create(NULL, 0, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &win);
   // Postpone the RMA with a rank-specific time
   sleep(rank);
   MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 0, 0, win);
   MPI_Put(&one, 1, MPI_INT, 0, rank, 1, MPI_INT, win);
   MPI_Win_unlock(0, win);
   MPI_Win_free(&win);
}

 
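(A note on the output format below: the timestamps are seconds since the Unix
epoch with microsecond resolution. The actual logging code is in the attached
rma.c; a gettimeofday-based helper along the following lines would produce the
same format.)

#include <stdio.h>
#include <sys/time.h>

// Print a message prefixed with a "[seconds.microseconds]" timestamp,
// e.g. "[1379003818.571960] 0 workers checked in"
static void log_msg(const char *msg)
{
   struct timeval tv;
   gettimeofday(&tv, NULL);
   printf("[%ld.%06ld] %s\n", (long)tv.tv_sec, (long)tv.tv_usec, msg);
}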

Attached is a complete sample program. The program hangs when run with the
default MCA settings:

 

$ mpirun -n 3 ./rma.x

[1379003818.571960] 0 workers checked in

[1379003819.571317] Worker 1 acquired lock

[1379003819.571374] Worker 1 unlocking the window

[1379003820.571342] Worker 2 acquired lock

[1379003820.571384] Worker 2 unlocking the window

<hangs>

On the other hand, it works as expected if pt2pt is forced:

 

$ mpirun --mca osc pt2pt -n 3 ./rma.x | sort

[1379003926.000442] 0 workers checked in

[1379003926.998981] Worker 1 acquired lock

[1379003926.999027] Worker 1 unlocking the window

[1379003926.999076] Worker 1 synched

[1379003926.999078] 1 workers checked in

[1379003927.998917] Worker 2 acquired lock

[1379003927.998940] Worker 2 unlocking the window

[1379003927.998962] Worker 2 synched

[1379003927.998964] 2 workers checked in

[1379003927.998973] All workers checked in

[1379003927.998996] Worker 1 done

[1379003927.998996] Worker 2 done

[1379003927.999099] Master finished

 

All processes are started on the same host. Open MPI is 1.6.4, built without
a progress thread. The output from ompi_info is attached. The same behaviour
(hang with rdma, success with pt2pt) is observed both when the tcp BTL is
used and when all processes run on separate cluster nodes and talk via the
openib BTL.

 

Is this a bug in the rdma OSC component or does the sample program violate
the MPI correctness requirements for RMA operations?
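If it turns out to be the former, one workaround I am considering (untested;
it simply assumes that the master never enters the progress engine while it
is outside its lock/unlock epoch) is to add a dummy MPI call such as
MPI_Iprobe to the master's polling loop:

// Master polling loop with an extra "poke" of the progress engine
// between the exclusive-lock epochs (sketch, not verified to help)
int flag;
do
{
   MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 0, 0, win);
   nonzeros = 0;
   for (i = 0; i < size; i++)
      if (array[i] != 0)
         nonzeros++;
   MPI_Win_unlock(0, win);

   // Dummy call whose only purpose is to let the library progress
   // any RMA requests queued by the workers
   MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD,
              &flag, MPI_STATUS_IGNORE);
} while (nonzeros < size - 1);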

 

Kind regards,

Hristo

 

--
Hristo Iliev, PhD - High Performance Computing Team
RWTH Aachen University, Center for Computing and Communication
Rechen- und Kommunikationszentrum der RWTH Aachen
Seffenter Weg 23, D 52074 Aachen (Germany)

  • text/plain attachment: rma.c

  • application/pkcs7-signature attachment: smime.p7s