
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale
From: Iain Bason (Iain.Bason_at_[hidden])
Date: 2010-02-09 19:29:25


Well, I am by no means an expert on the GNU-style asm directives. I
believe someone else (George Bosilca?) tweaked what I had suggested.

That being said, I think the memory "clobber" is harmless.
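
For what it's worth, here is how I read the constraints on the new routine.
This is just an annotated sketch of the same compare-and-set (the comments
and the cmpset_32_sketch name are mine, not from the OMPI source):

#include <stdint.h>

#define SMPLOCK "lock; "

/* Annotated sketch of the 1.4.1-style compare-and-set, with the clobber. */
static inline int cmpset_32_sketch(volatile int32_t *addr,
                                   int32_t oldval, int32_t newval)
{
   unsigned char ret;
   __asm__ __volatile__ (
       SMPLOCK "cmpxchgl %3,%4   \n\t"  /* if (*addr == eax) *addr = newval */
               "sete     %0      \n\t"  /* ret = ZF, i.e. 1 on success      */
       : "=qm" (ret),    /* %0: success flag, byte register or memory       */
         "=a" (oldval),  /* %1: eax is written back (old *addr on failure)  */
         "=m" (*addr)    /* %2: the compiler is told *addr may be written   */
       : "q" (newval),   /* %3: new value, in a byte-addressable register   */
         "m" (*addr),    /* %4: *addr is also read                          */
         "1" (oldval)    /* %5: oldval preloaded into eax (same as %1)      */
       : "memory");      /* compiler barrier: no other memory access gets
                            cached or reordered across the asm              */
   return (int)ret;
}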

Iain

On Feb 9, 2010, at 5:51 PM, Jeff Squyres wrote:

> Iain did the genius work for the new assembly. Iain -- can you respond?
>
>
> On Feb 9, 2010, at 5:44 PM, Mostyn Lewis wrote:
>
>> The old opal_atomic_cmpset_32 worked:
>>
>> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>>                                          int32_t oldval, int32_t newval)
>> {
>>    unsigned char ret;
>>    __asm__ __volatile__ (
>>                SMPLOCK "cmpxchgl %1,%2   \n\t"
>>                        "sete     %0      \n\t"
>>                : "=qm" (ret)
>>                : "q"(newval), "m"(*addr), "a"(oldval)
>>                : "memory");
>>
>>    return (int)ret;
>> }
>>
>> The new opal_atomic_cmpset_32 fails:
>>
>> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>>                                          int32_t oldval, int32_t newval)
>> {
>>    unsigned char ret;
>>    __asm__ __volatile__ (
>>                SMPLOCK "cmpxchgl %3,%4   \n\t"
>>                        "sete     %0      \n\t"
>>                : "=qm" (ret), "=a" (oldval), "=m" (*addr)
>>                : "q"(newval), "m"(*addr), "1"(oldval));
>>
>>    return (int)ret;
>> }
>>
>> **However**, if you put back the "memory" clobber line (the third ":"),
>> it works:
>>
>> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>>                                          int32_t oldval, int32_t newval)
>> {
>>    unsigned char ret;
>>    __asm__ __volatile__ (
>>                SMPLOCK "cmpxchgl %3,%4   \n\t"
>>                        "sete     %0      \n\t"
>>                : "=qm" (ret), "=a" (oldval), "=m" (*addr)
>>                : "q"(newval), "m"(*addr), "1"(oldval)
>>                : "memory");
>>
>>    return (int)ret;
>> }
>>
>> This works in a test case for pathcc, gcc, icc, pgcc, Sun Studio cc and
>> open64 (pathscale lineage - which also fails with 1.4.1); a sketch of the
>> kind of test I mean is below.
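>>
>> Roughly, the test looks like this (a minimal sketch rather than the exact
>> harness; my_cmpset_32 is just a local copy of the fixed routine above):
>>
>> #include <stdio.h>
>> #include <stdint.h>
>>
>> #define SMPLOCK "lock; "
>>
>> /* Local copy of the fixed routine, with the memory clobber. */
>> static inline int my_cmpset_32(volatile int32_t *addr,
>>                                int32_t oldval, int32_t newval)
>> {
>>    unsigned char ret;
>>    __asm__ __volatile__ (
>>                SMPLOCK "cmpxchgl %3,%4   \n\t"
>>                        "sete     %0      \n\t"
>>                : "=qm" (ret), "=a" (oldval), "=m" (*addr)
>>                : "q"(newval), "m"(*addr), "1"(oldval)
>>                : "memory");
>>    return (int)ret;
>> }
>>
>> int main(void)
>> {
>>    volatile int32_t x = 5;
>>    int hit  = my_cmpset_32(&x, 5, 7);   /* expected: succeeds, x == 7 */
>>    int miss = my_cmpset_32(&x, 5, 9);   /* expected: fails, x stays 7 */
>>    printf("hit=%d miss=%d x=%d\n", hit, miss, (int)x);
>>    return (hit == 1 && miss == 0 && x == 7) ? 0 : 1;
>> }
>>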
>> Also, the SMPLOCK above is defined as "lock; " - the ";" is a GNU as
>> statement delimiter - is that right? It seems to work with or without
>> the ";".
>>
>>
>> Also, a question - I see you generate, via perl, another "lock" asm file
>> which you put into opal/asm/generated/<whatever, e.g. atomic-amd64-linux.s>
>> and stick into libasm. Has what you generate there, for whatever usage,
>> not changed from 1.4 -> 1.4.1 -> svn trunk?
>>
>> DM
>>
>> On Tue, 9 Feb 2010, Jeff Squyres wrote:
>>
>>> Perhaps someone with a pathscale compiler support contract can
>>> investigate this with them.
>>>
>>> Have them contact us if they want/need help understanding our
>>> atomics; we're happy to explain, etc. (the atomics are fairly
>>> localized to a small part of OMPI).
>>>
>>>
>>>
>>> On Feb 9, 2010, at 11:42 AM, Mostyn Lewis wrote:
>>>
>>>> All,
>>>>
>>>> FWIW, Pathscale is dying in the new atomics in 1.4.1 (and svn
>>>> trunk) - it is actually looping.
>>>>
>>>> from gdb:
>>>>
>>>> opal_progress_event_users_decrement () at ../.././opal/include/opal/sys/atomic_impl.h:61
>>>> 61         } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
>>>> Current language: auto; currently asm
>>>> (gdb) where
>>>> #0 opal_progress_event_users_decrement () at ../.././opal/include/opal/sys/atomic_impl.h:61
>>>> #1 0x0000000000000001 in ?? ()
>>>> #2 0x00002aec4cf6a5e0 in ?? ()
>>>> #3 0x00000000000000eb in ?? ()
>>>> #4 0x00002aec4cfb57e0 in ompi_mpi_init () at ../.././ompi/runtime/ompi_mpi_init.c:818
>>>> #5 0x00007fff5db3bd58 in ?? ()
>>>> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
>>>> (gdb) list
>>>> 56      {
>>>> 57         int32_t oldval;
>>>> 58
>>>> 59         do {
>>>> 60            oldval = *addr;
>>>> 61         } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
>>>> 62         return (oldval - delta);
>>>> 63      }
>>>> 64      #endif  /* OPAL_HAVE_ATOMIC_SUB_32 */
>>>> 65
>>>> (gdb)
>>>>
>>>> DM
>>>>
>>>> On Tue, 9 Feb 2010, Jeff Squyres wrote:
>>>>
>>>>> FWIW, I have had terrible luck with the pathscale compiler over
>>>>> the years. Repeated attempts to get support from them -- even
>>>>> when I was a paying customer -- resulted in no help (e.g., a
>>>>> pathCC bug with the OMPI C++ bindings that I filed years ago was
>>>>> never resolved).
>>>>>
>>>>> Is this compiler even supported anymore? I.e., is there a
>>>>> support department somewhere that you have a hope of getting any
>>>>> help from?
>>>>>
>>>>> I can't say for sure, of course, but if MPI hello world hangs,
>>>>> it smells like a compiler bug. You might want to attach to
>>>>> "hello world" in a debugger and see where it's hung. You might
>>>>> need to compile OMPI with debugging symbols to get any
>>>>> meaningful information.
>>>>>
>>>>> ** NOTE: My personal feelings about the pathscale compiler suite
>>>>> do not reflect anyone else's feelings in the Open MPI
>>>>> community. Perhaps someone could change my mind someday, but
>>>>> *I* have personally given up on this compiler. :-(
>>>>>
>>>>>
>>>>> On Feb 8, 2010, at 2:38 AM, Rafael Arco Arredondo wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> It does work with version 1.4. This is the hello world that
>>>>>> hangs with
>>>>>> 1.4.1:
>>>>>>
>>>>>> #include <stdio.h>
>>>>>> #include <mpi.h>
>>>>>>
>>>>>> int main(int argc, char **argv)
>>>>>> {
>>>>>>   int node, size;
>>>>>>
>>>>>>   MPI_Init(&argc, &argv);
>>>>>>   MPI_Comm_rank(MPI_COMM_WORLD, &node);
>>>>>>   MPI_Comm_size(MPI_COMM_WORLD, &size);
>>>>>>
>>>>>>   printf("Hello World from Node %d of %d.\n", node, size);
>>>>>>
>>>>>>   MPI_Finalize();
>>>>>>   return 0;
>>>>>> }
>>>>>>
>>>>>> On Tue, 26-01-2010 at 03:57 -0500, Åke Sandgren wrote:
>>>>>>> 1 - Do you have problems with openmpi 1.4 too? (I don't; I haven't
>>>>>>> built 1.4.1 yet.)
>>>>>>> 2 - There is a bug in the pathscale compiler with -fPIC and -g that
>>>>>>> generates incorrect dwarf2 data, so debuggers get really confused
>>>>>>> and will have BIG problems debugging the code. I'm chasing them to
>>>>>>> get a fix...
>>>>>>> 3 - Do you have example code that has problems?
>>>>>>
>>>>>> --
>>>>>> Rafael Arco Arredondo
>>>>>> Centro de Servicios de Informática y Redes de Comunicaciones
>>>>>> Universidad de Granada
>>>>>>
>>>>>
>>>>
>>>
>>
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
>
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users