Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] segfault when combining OpenMPI and GotoBLAS2
From: Eloi Gaudry (eg_at_[hidden])
Date: 2010-01-18 11:49:43


Dorian Krause wrote:
> Hi Eloi,
>>
>> Do the segmentation faults you're facing also happen in a
>> sequential environment (i.e. not linked against the Open MPI libraries)?
>
> No, without MPI everything works fine. Also, linking against mvapich
> doesn't give any errors. I think there is a problem with GotoBLAS and
> the shared library infrastructure of OpenMPI. The code never even
> reaches the point of executing the gemm operation.
>
>> Have you already informed Kazushige Goto (the developer of GotoBLAS)?
>
> Not yet. Since the problem only happens with openmpi and the BLAS
> (stand-alone) seems to work, I thought the openmpi mailing list would
> be the better place to discuss this (to get a grasp of what the error
> could be before going to the GotoBLAS mailing list).
>
>>
>> Regards,
>> Eloi
>>
>> PS: Could you post your Makefile.rule here so that we can check which
>> compilation options were chosen?
>
> I didn't make any changes to Makefile.rule. This is the content
> of Makefile.conf:
>
> OSNAME=Linux
> ARCH=x86_64
> C_COMPILER=GCC
> BINARY32=
> BINARY64=1
> CEXTRALIB=-L/usr/lib/gcc/x86_64-redhat-linux/4.1.2
> -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2
> -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64
> -L/lib/../lib64 -L/usr/lib/../lib64 -lc
> F_COMPILER=GFORTRAN
> FC=gfortran
> BU=_
> FEXTRALIB=-L/usr/lib/gcc/x86_64-redhat-linux/4.1.2
> -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2
> -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64
> -L/lib/../lib64 -L/usr/lib/../lib64 -lgfortran -lm -lgfortran -lm -lc
> CORE=BARCELONA
> LIBCORE=barcelona
> NUM_CORES=8
> HAVE_MMX=1
> HAVE_SSE=1
> HAVE_SSE2=1
> HAVE_SSE3=1
> HAVE_SSE4A=1
> HAVE_3DNOWEX=1
> HAVE_3DNOW=1
> MAKE += -j 8
> SGEMM_UNROLL_M=8
> SGEMM_UNROLL_N=4
> DGEMM_UNROLL_M=4
> DGEMM_UNROLL_N=4
> QGEMM_UNROLL_M=2
> QGEMM_UNROLL_N=2
> CGEMM_UNROLL_M=4
> CGEMM_UNROLL_N=2
> ZGEMM_UNROLL_M=2
> ZGEMM_UNROLL_N=2
> XGEMM_UNROLL_M=1
> XGEMM_UNROLL_N=1
>
>
> Thanks,
> Dorian
>
Dorian,

I've been experiencing a similar issue on two different Opteron
architectures (22xx and 25x), in a sequential environment, when using
v2-1.10 of GotoBLAS. If you can downgrade to version 2-1.09, I bet you
will no longer see these segfaults. In any case, I'm pretty sure
Kazushige is working on a fix right now.

Eloi