Building 1.2.6 with the PGI compilers produces an MPI library that
works with tcp and sm, but it segfaults on openib.
Both our Intel and PGI builds of 1.2.6 blow up like this when we
force IB, so this is a new issue.
Is there a way to shut off early completion in 1.2.3? Or is the
above a known issue, and I should use 1.2.7-pre or grab a 1.3 snapshot?
Center for Advanced Computing
On Jul 2, 2008, at 10:42 AM, Pavel Shamis (Pasha) wrote:
> Maybe this FAQ will help: http://www.open-mpi.org/faq/?
> Brock Palen wrote:
>> We have a code (arts) that locks up only when running on IB.
>> Works fine on tcp and sm.
>> When we ran it in a debugger, it locked up on an MPI_Comm_split()
>> that, as far as I could tell, was valid.
>> Because the split was a hack they did to use MPI_File_open() on a
>> single cpu, we reworked it to remove the split. The code then
>> locks up again.
>> This time it locked up on an MPI_Allreduce(), which was really
>> strange. When running on 8 cpus, only rank 4 would get stuck. The
>> rest of the ranks are fine and get the right value. (We are using
>> ddt as our debugger.)
>> It's very strange. Do you have any idea what could cause this to
>> happen? We are using openmpi-1.2.3/1.2.6 with PGI compilers.
>> Brock Palen
>> Center for Advanced Computing
>> users mailing list