Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] ScaLapack and BLACS on Leopard
From: Terry Dontje (Terry.Dontje_at_[hidden])
Date: 2008-03-07 07:26:01


To close out this issue, and to eat some crow: the problem I described
below turned out to be caused by a messed-up version of BLACS. Using the
ScaLAPACK installer to retrieve BLACS, together with the options
mentioned in the Open MPI FAQ, does produce a working BLACS. So there is
no need to change WHATMPI.

Sorry for the misinformation,

--td

Terry Dontje wrote:
> Ok, I think I found the cause of the SPARC segv when trying to use a
> 64-bit compiled Open MPI library. If one does not set the WHATMPI
> variable in the Bmake.inc it defaults to UseF77Mpi which assumes all
> handles are ints. This is a correct assumption if you are using the
> F77 interfaces, but the way BLACS seems to compile for Open MPI, it
> uses the C versions. So the handles are stored as 32 bits in BLACS and
> passed to the C Open MPI interfaces, which expect 64 bits. In cases
> where your addresses need more than 32 bits, this coercion hands MPI
> an invalid address and it segvs.
>
> So by setting "WHATMPI= -DUseCMpi" I've gotten the SPARC version of
> BLACS compiled for 64 bits to pass its tests without segv'ing. I do
> believe this issue actually exists on other platforms (i.e., AMD64
> and IA64) with other OSes and compilers; we've just been lucky that
> MPI_COMM_WORLD is allocated such that it has an address that fits in
> 32 bits. I am still amazed that we haven't seen this fail in user
> codes. Note, I have not confirmed this failure with a test case but
> the code stack in dbx looks the same on X64 platforms as the code on
> SPARC except the address is smaller on the former.
>
> Greg, I would be interested in knowing if you are still seeing the
> problem on Leopard and whether the above setting helps any.
>
> --td
>
>> Subject: Re: [OMPI users] ScaLapack and BLACS on Leopard
>> From: Terry Dontje (Terry.Dontje_at_[hidden])
>> Date: 2008-03-03 07:34:17
>>
>> What kind of system lib errors are you seeing and do you have a stack
>> trace? Note, I was trying something similar with Solaris and 64-bit on
>> a SPARC machine and was seeing segv's inside the MPI Library due to a
>> pointer being passed through an integer (thus dropping the upper 32
>> bits). Funny thing is it all works under Solaris on AMD64 or IA-64
>> platforms.
>>
>> --td
>>
>> > Date: Thu, 28 Feb 2008 17:50:28 -0500
>> > From: Gregory John Orris <gregory.orris_at_[hidden]>
>> > Subject: [OMPI users] ScaLapack and BLACS on Leopard
>> > To: Open MPI Users <users_at_[hidden]>
>> > Message-ID: <528FD4C0-6157-49CB-80E6-1C62684E4545_at_[hidden]>
>> > Content-Type: text/plain; charset="us-ascii"
>> >
>> > Hey Folks,
>> >
>> > Anyone got ScaLapack and BLACS working and not just compiled under
>> > OSX10.5 in 64-bit mode?
>> > The FAQ site directions were followed and everything compiles just
>> > fine. But ALL of the single precision routines and many of the double
>> > precision routines in the TESTING directory fail with system lib
>> > errors.
>> >
>> > I've gotten some interesting errors and am wondering what the magic
>> > touch is.
>> >
>> > Regards,
>> > Greg
>> >
>