Simone Pellegrini wrote:
> Sorry for the delay, but I did some additional experiments to find out
> whether the problem was openmpi or gcc!
> The program just hangs... and never terminates! I am running on an SMP
> machine with 32 cores; it is a Sun Fire X4600 X2 (8 quad-core
> Barcelona AMD chips). The OS is CentOS 5 and the kernel is
> 2.6.18-92.el5.src-PAPI (patched with PAPI).
> I use an N of 1024, and if I print out the value of the iterator i,
> sometimes it stops around 165, other times around 520... and it
> doesn't make any sense.
> If I run the same binary with the mpirun from a different MPI version
> (and it's important to note that I don't recompile it), the program
> works fine. I did some experiments during the weekend, and if I use
> openmpi-1.3.2 compiled with gcc 4.3.3 everything works fine.
> So I really think the problem is strictly related to the usage of
> gcc-4.4.0! ...and it doesn't depend on the OpenMPI version, as the
> program hangs even when I use openmpi 1.3.1 compiled with gcc 4.4!
I finally got GCC 4.4, but was unable to reproduce the problem. How
small can you make np (number of MPI processes) and still see the
problem? How reproducible is the problem? When it hangs, can you get
stack traces of all the processes? We're trying to hunt down some
similar behavior, but I think yours is of a different flavor.
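
In case it helps with collecting the stack traces, here is a rough sketch of one way to do it on a single node. It assumes gdb and pgrep are installed and that you are allowed to attach to the processes. The helper names (find_pids, gdb_backtrace_cmd, dump_stacks) are made up for illustration and are not part of Open MPI.

```python
import subprocess


def gdb_backtrace_cmd(pid):
    """Build a gdb command line that attaches to `pid`, prints a
    backtrace for every thread, and then exits (batch mode)."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def find_pids(program_name):
    """Return the PIDs of processes whose name matches `program_name`
    exactly. Relies on pgrep; note that on Linux the process name that
    pgrep matches against is limited to 15 characters."""
    result = subprocess.run(
        ["pgrep", "-x", program_name], capture_output=True, text=True
    )
    return [int(tok) for tok in result.stdout.split()]


def dump_stacks(program_name):
    """Print a gdb backtrace for each matching process on this node."""
    for pid in find_pids(program_name):
        print(f"==== PID {pid} ====")
        out = subprocess.run(
            gdb_backtrace_cmd(pid), capture_output=True, text=True
        )
        print(out.stdout)
```

Once the job is hung, calling dump_stacks with the name of your executable from a second shell on the same machine should print one trace per MPI process. Building the binary with -g makes the traces much easier to read.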