Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] collective communications broken on more than 4 cores
From: Vincent Loechner (loechner_at_[hidden])
Date: 2009-10-29 10:21:13


> > It seems that the calls to collective communication are not
> > returning for some MPI processes when the number of processes is
> > greater than or equal to 5. It's reproducible on two different
> > architectures, with two different versions of OpenMPI (1.3.2 and
> > 1.3.3). It was working correctly with OpenMPI version 1.2.7.
>
> Does it work if you turn off the shared memory transport layer; that is,
>
> mpirun -n 6 -mca btl ^sm ./testmpi

Yes, it does, on both of my configurations (AMD and Intel processors).
So it seems that the synchronization in the shared memory transport
is broken.
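
For anyone who wants to keep the workaround in place without editing
every command line, the same exclusion can be set through the
environment. This is only a sketch: it assumes the usual OMPI_MCA_<name>
environment variable convention for MCA parameters, and ./testmpi stands
for the test program from this thread.

```shell
# Exclude the shared memory BTL for every mpirun started from this shell
# (assumed equivalent to passing "-mca btl ^sm" on the command line).
export OMPI_MCA_btl='^sm'

# Run the test only if mpirun and the test binary are actually present.
if command -v mpirun >/dev/null 2>&1 && [ -x ./testmpi ]; then
    mpirun -n 6 ./testmpi
fi
```

Unsetting OMPI_MCA_btl restores the default transport selection.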

It could be a system bug; I don't know what library OpenMPI uses
(is it IPC?). Both of my systems run Linux 2.6.31; the AMD machine is
Ubuntu and the Intel machine is Arch Linux.

--Vincent