
Subject: [OMPI users] hangs of MPI_WIN_LOCK/UNLOCK (gfortran)
From: eatdirt (dirteat_at_[hidden])
Date: 2012-08-16 14:35:57


Hi there,
I have attached a small piece of code that summarizes a "bug?" that has
been bothering me no end. Issuing various calls to MPI_WIN_LOCK/UNLOCK
seems to hang some processes until an MPI_BARRIER is encountered!??

My experience with MPI is very modest, so I apologize in advance if I
have misread the MPI-2 specs, but it looks like what I want to do is correct.

If you look at the file hangs.F90, the code starts with various calls to
LOCK/UNLOCK and then goes on with, let's say, a big piece of work,
between the comments "start action" and "action done". For the purpose
of this example, that's a do loop lasting 10 seconds.
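
In case the attachment does not come through on the list, here is a
minimal sketch of what hangs.F90 does. The window contents and the
shared-lock/MPI_GET epoch are only an illustration I am reconstructing
from memory; what matters is the sequence: lock/unlock epochs, then
roughly 10 seconds of work between "start action" and "action done",
then the barrier.

program hangs
  use mpi
  implicit none

  integer :: ierr, rank, nprocs, win, tgt
  integer :: buffer(1), local(1)
  integer(kind=MPI_ADDRESS_KIND) :: winsize, disp
  double precision :: t0

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

  ! every rank exposes one integer in a window (disp_unit = 4 bytes)
  buffer(1) = rank
  winsize = 4
  call MPI_WIN_CREATE(buffer, winsize, 4, MPI_INFO_NULL, &
                      MPI_COMM_WORLD, win, ierr)

  ! various lock/unlock epochs: read the value exposed by every other rank
  disp = 0
  do tgt = 0, nprocs - 1
     if (tgt /= rank) then
        call MPI_WIN_LOCK(MPI_LOCK_SHARED, tgt, 0, win, ierr)
        call MPI_GET(local, 1, MPI_INTEGER, tgt, disp, 1, MPI_INTEGER, win, ierr)
        call MPI_WIN_UNLOCK(tgt, win, ierr)      ! <-- some ranks hang here
     end if
  end do

  ! start action: roughly 10 seconds of independent work
  print *, 'start action for rank=', rank
  t0 = MPI_WTIME()
  do
     if (MPI_WTIME() - t0 > 10.d0) exit
  end do
  print *, 'action done for rank=', rank
  ! action done

  call MPI_BARRIER(MPI_COMM_WORLD, ierr)  ! stuck ranks only leave UNLOCK once the others get here

  call MPI_WIN_FREE(win, ierr)
  call MPI_FINALIZE(ierr)
end program hangs

I build it with mpif90 hangs.F90 -o hangs and run it as shown below.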

I don't want to put a barrier after the various calls to LOCK/UNLOCK
because I want the code to run asynchronously. Also notice that I don't
need a mutex or anything like that; all those calls can be done
simultaneously and in any order. My only problem is the following hangs:

Here is the output when the code is run on an SMP machine (8 cores) with
an increasing number of processes (the same occurs with distributed
memory).

mpirun -np 1 ./hangs
start action for rank= 0
(10 seconds later)
action done for rank= 0

<----works as I expect.

mpirun -np 2 ./hangs
start action for rank= 1
start action for rank= 0
(10 secs later)
action done for rank= 1
action done for rank= 0

<----so far so good; but with more processes the "bug?" appears:

mpirun -np 3 ./hangs
start action for rank= 1
start action for rank= 0
(10 secs later)
action done for rank= 0
action done for rank= 1
start action for rank= 2
(10 secs later)
action done for rank= 2

Process 2 remained stuck in the MPI_WIN_UNLOCK call until 0 and 1
reached the MPI_BARRIER instruction, which actually renders the
execution serial :)

I tested with up to 8 processes and the problem becomes even worse; a
random number of processes get stuck in MPI_WIN_UNLOCK. However, this
does not occur on every execution. Sometimes, though rarely, all the
processes are released from the UNLOCK as expected.

Additionally, if an MPI_BARRIER is issued just after the MPI_WIN_UNLOCK,
there is no problem any more; but I never read in the MPI-2 specs that
this should be necessary, and it would completely kill the point of
performing asynchronous operations.
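
Concretely, the workaround I mean is just one extra line after the
lock/unlock loop in the sketch above, before the "start action" print:

  call MPI_BARRIER(MPI_COMM_WORLD, ierr)  ! with this, nobody hangs, but everybody waits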

gcc/gfortran: 4.6.3
Open MPI: 1.4.5

Please let me know if this behaviour can be fixed and if you need
additional information!

Thanks in advance,
Cheers,
Chris.