Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale
From: Terry Dontje (Terry.Dontje_at_[hidden])
Date: 2010-02-10 05:39:29


Jeff Squyres wrote:
> Iain did the genius work for the new assembly. Iain -- can you respond?
>
>
Iain is on vacation right now, so he probably won't be able to respond
until next week.

--td
> On Feb 9, 2010, at 5:44 PM, Mostyn Lewis wrote:
>
>
>> The old opal_atomic_cmpset_32 worked:
>>
>> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>>                                          int32_t oldval, int32_t newval)
>> {
>>    unsigned char ret;
>>    __asm__ __volatile__ (
>>                    SMPLOCK "cmpxchgl %1,%2   \n\t"
>>                            "sete     %0      \n\t"
>>                    : "=qm" (ret)
>>                    : "q"(newval), "m"(*addr), "a"(oldval)
>>                    : "memory");
>>
>>    return (int)ret;
>> }
>>
>> The new opal_atomic_cmpset_32 fails:
>>
>> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>>                                          int32_t oldval, int32_t newval)
>> {
>>    unsigned char ret;
>>    __asm__ __volatile__ (
>>                    SMPLOCK "cmpxchgl %3,%4   \n\t"
>>                            "sete     %0      \n\t"
>>                    : "=qm" (ret), "=a" (oldval), "=m" (*addr)
>>                    : "q"(newval), "m"(*addr), "1"(oldval));
>>
>>    return (int)ret;
>> }
>>
>> **However**, if you put back the memory "clobber" line (the 3rd ":"), it works:
>>
>> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>>                                          int32_t oldval, int32_t newval)
>> {
>>    unsigned char ret;
>>    __asm__ __volatile__ (
>>                    SMPLOCK "cmpxchgl %3,%4   \n\t"
>>                            "sete     %0      \n\t"
>>                    : "=qm" (ret), "=a" (oldval), "=m" (*addr)
>>                    : "q"(newval), "m"(*addr), "1"(oldval)
>>                    : "memory");
>>
>>    return (int)ret;
>> }
>>
>> This works in a test case for pathcc, gcc, icc, pgcc, SUN Studio cc, and open64 (pathscale
>> lineage - which also fails with 1.4.1).
>> Also, the SMPLOCK above is defined as "lock; " - the ";" is a GNU as statement delimiter - is
>> that right? It seems to work with or without the ";".
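>>
>> A minimal standalone sketch of such a test case, assuming gcc-style inline
>> asm on x86-64 (the LOCK_PREFIX macro and the main() harness are illustrative
>> stand-ins, not Open MPI code):
>>
>> #include <stdio.h>
>> #include <stdint.h>
>>
>> #define LOCK_PREFIX "lock; "   /* stands in for SMPLOCK */
>>
>> /* Same constraint layout as the working version above, including the
>>    third ":" memory clobber. */
>> static inline int cmpset_32(volatile int32_t *addr,
>>                             int32_t oldval, int32_t newval)
>> {
>>    unsigned char ret;
>>    __asm__ __volatile__ (
>>                    LOCK_PREFIX "cmpxchgl %3,%4   \n\t"
>>                                "sete     %0      \n\t"
>>                    : "=qm" (ret), "=a" (oldval), "=m" (*addr)
>>                    : "q"(newval), "m"(*addr), "1"(oldval)
>>                    : "memory");
>>
>>    return (int)ret;
>> }
>>
>> int main(void)
>> {
>>    volatile int32_t val = 100;
>>    int32_t old;
>>
>>    /* Retry loop in the style of the atomic_impl.h decrement:
>>       subtract 1 until the compare-and-swap reports success. */
>>    do {
>>        old = val;
>>    } while (0 == cmpset_32(&val, old, old - 1));
>>
>>    printf("val = %d (expected 99)\n", (int)val);
>>    return 0;
>> }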
>>
>>
>> Also, a question - I see you generate, via perl, another "lock" asm file which you put into
>> opal/asm/generated/<whatever, e.g. atomic-amd64-linux.s> and stick into libasm - whatever you
>> generate there, for whatever usage, hasn't changed from 1.4 to 1.4.1 to the svn trunk?
>>
>> DM
>>
>> On Tue, 9 Feb 2010, Jeff Squyres wrote:
>>
>>
>>> Perhaps someone with a pathscale compiler support contract can investigate this with them.
>>>
>>> Have them contact us if they want/need help understanding our atomics; we're happy to explain, etc. (the atomics are fairly localized to a small part of OMPI).
>>>
>>>
>>>
>>> On Feb 9, 2010, at 11:42 AM, Mostyn Lewis wrote:
>>>
>>>
>>>> All,
>>>>
>>>> FWIW, Pathscale is dying in the new atomics in 1.4.1 (and svn trunk) - actually looping -
>>>>
>>>> from gdb:
>>>>
>>>> opal_progress_event_users_decrement () at ../.././opal/include/opal/sys/atomic_impl.h:61
>>>> 61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
>>>> Current language: auto; currently asm
>>>> (gdb) where
>>>> #0 opal_progress_event_users_decrement () at ../.././opal/include/opal/sys/atomic_impl.h:61
>>>> #1 0x0000000000000001 in ?? ()
>>>> #2 0x00002aec4cf6a5e0 in ?? ()
>>>> #3 0x00000000000000eb in ?? ()
>>>> #4 0x00002aec4cfb57e0 in ompi_mpi_init () at ../.././ompi/runtime/ompi_mpi_init.c:818
>>>> #5 0x00007fff5db3bd58 in ?? ()
>>>> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
>>>> (gdb) list
>>>> 56 {
>>>> 57 int32_t oldval;
>>>> 58
>>>> 59 do {
>>>> 60 oldval = *addr;
>>>> 61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
>>>> 62 return (oldval - delta);
>>>> 63 }
>>>> 64 #endif /* OPAL_HAVE_ATOMIC_SUB_32 */
>>>> 65
>>>> (gdb)
>>>>
>>>> DM
>>>>
>>>> On Tue, 9 Feb 2010, Jeff Squyres wrote:
>>>>
>>>>
>>>>> FWIW, I have had terrible luck with the pathscale compiler over the years. Repeated attempts to get support from them -- even when I was a paying customer -- resulted in no help (e.g., a pathCC bug with the OMPI C++ bindings that I filed years ago was never resolved).
>>>>>
>>>>> Is this compiler even supported anymore? I.e., is there a support department somewhere that you have a hope of getting any help from?
>>>>>
>>>>> I can't say for sure, of course, but if MPI hello world hangs, it smells like a compiler bug. You might want to attach to "hello world" in a debugger and see where it's hung. You might need to compile OMPI with debugging symbols to get any meaningful information.
>>>>>
>>>>> ** NOTE: My personal feelings about the pathscale compiler suite do not reflect anyone else's feelings in the Open MPI community. Perhaps someone could change my mind someday, but *I* have personally given up on this compiler. :-(
>>>>>
>>>>>
>>>>> On Feb 8, 2010, at 2:38 AM, Rafael Arco Arredondo wrote:
>>>>>
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> It does work with version 1.4. This is the hello world that hangs with
>>>>>> 1.4.1:
>>>>>>
>>>>>> #include <stdio.h>
>>>>>> #include <mpi.h>
>>>>>>
>>>>>> int main(int argc, char **argv)
>>>>>> {
>>>>>> int node, size;
>>>>>>
>>>>>> MPI_Init(&argc,&argv);
>>>>>> MPI_Comm_rank(MPI_COMM_WORLD, &node);
>>>>>> MPI_Comm_size(MPI_COMM_WORLD, &size);
>>>>>>
>>>>>> printf("Hello World from Node %d of %d.\n", node, size);
>>>>>>
>>>>>> MPI_Finalize();
>>>>>> return 0;
>>>>>> }
>>>>>>
>>>>>> On Tue, 26-01-2010 at 03:57 -0500, Åke Sandgren wrote:
>>>>>>
>>>>>>> 1 - Do you have problems with openmpi 1.4 too? (I don't; I haven't built
>>>>>>> 1.4.1 yet.)
>>>>>>> 2 - There is a bug in the pathscale compiler with -fPIC and -g that
>>>>>>> generates incorrect dwarf2 data, so debuggers get really confused and
>>>>>>> will have BIG problems debugging the code. I'm chasing them to get a
>>>>>>> fix...
>>>>>>> 3 - Do you have example code that has problems?
>>>>>>>
>>>>>> --
>>>>>> Rafael Arco Arredondo
>>>>>> Centro de Servicios de Informática y Redes de Comunicaciones
>>>>>> Universidad de Granada
>>>>>>
>>>>> --
>>>>> Jeff Squyres
>>>>> jsquyres_at_[hidden]
>>>>>
>>> --
>>> Jeff Squyres
>>> jsquyres_at_[hidden]
>>>
>>
>
>
>