Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale
From: Mostyn Lewis (Mostyn.Lewis_at_[hidden])
Date: 2010-02-09 17:44:30


The old opal_atomic_cmpset_32 worked:

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
                                         int32_t oldval, int32_t newval)
{
    unsigned char ret;
    __asm__ __volatile__ (
                        SMPLOCK "cmpxchgl %1,%2 \n\t"
                                "sete %0 \n\t"
                        : "=qm" (ret)
                        : "q"(newval), "m"(*addr), "a"(oldval)
                        : "memory");

    return (int)ret;
}

The new opal_atomic_cmpset_32 fails:

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
                                         int32_t oldval, int32_t newval)
{
    unsigned char ret;
    __asm__ __volatile__ (
                        SMPLOCK "cmpxchgl %3,%4 \n\t"
                                "sete %0 \n\t"
                        : "=qm" (ret), "=a" (oldval), "=m" (*addr)
                        : "q"(newval), "m"(*addr), "1"(oldval)
    return (int)ret;
}

**However** if you put back the "memory" clobber line (the third ":"), it works:

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
                                         int32_t oldval, int32_t newval)
{
    unsigned char ret;
    __asm__ __volatile__ (
                        SMPLOCK "cmpxchgl %3,%4 \n\t"
                                "sete %0 \n\t"
                        : "=qm" (ret), "=a" (oldval), "=m" (*addr)
                        : "q"(newval), "m"(*addr), "1"(oldval)
                        : "memory");

    return (int)ret;
}

This works in a test case for pathcc, gcc, icc, pgcc, Sun Studio cc, and open64 (Pathscale
lineage - open64 also fails with 1.4.1).
Also, the SMPLOCK above is defined as "lock; " - the ";" is a GNU as statement delimiter - is
that right? It seems to work with or without the ";".
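
For reference, a minimal standalone reproducer along these lines exercises the same retry
loop that hangs (this is just a sketch - the file name, the local SMPLOCK definition, and the
driver loop are illustrative, not the exact test case; the cmpset body is the fixed 1.4.1
variant above and the do/while is the one from opal/include/opal/sys/atomic_impl.h). It is
x86/x86_64-only; build with e.g. "pathcc -O2 cmpset_test.c" (and likewise gcc/icc/pgcc/suncc/opencc):

/* cmpset_test.c - illustrative standalone reproducer, not OMPI source */
#include <stdio.h>
#include <stdint.h>

/* "lock" is an instruction prefix; the ";" just ends the gas statement,
 * so "lock; cmpxchgl" and "lock cmpxchgl" assemble to the same thing. */
#define SMPLOCK "lock; "

/* 1.4.1-style cmpset with the "memory" clobber restored (the working variant) */
static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
                                         int32_t oldval, int32_t newval)
{
    unsigned char ret;
    __asm__ __volatile__ (
                        SMPLOCK "cmpxchgl %3,%4 \n\t"
                                "sete %0 \n\t"
                        : "=qm" (ret), "=a" (oldval), "=m" (*addr)
                        : "q"(newval), "m"(*addr), "1"(oldval)
                        : "memory");

    return (int)ret;
}

/* The retry loop from atomic_impl.h, inlined here so any spin shows up
 * outside of Open MPI. */
static int32_t atomic_sub_32(volatile int32_t *addr, int delta)
{
    int32_t oldval;

    do {
        oldval = *addr;
    } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
    return (oldval - delta);
}

int main(void)
{
    volatile int32_t counter = 10;
    int i;

    for (i = 0; i < 10; i++) {
        printf("counter now %d\n", (int)atomic_sub_32(&counter, 1));
    }
    return (counter == 0) ? 0 : 1;
}

Dropping the "memory" clobber from that body gives you the failing 1.4.1 variant shown earlier.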

Also, a question - I see you generate, via perl, another "lock" asm file which you put into
opal/asm/generated/<whatever, e.g. atomic-amd64-linux.s> and stick into libasm - am I right
that what you generate there, for whatever usage, hasn't changed from 1.4 to 1.4.1 to the svn trunk?

DM

On Tue, 9 Feb 2010, Jeff Squyres wrote:

> Perhaps someone with a pathscale compiler support contract can investigate this with them.
>
> Have them contact us if they want/need help understanding our atomics; we're happy to explain, etc. (the atomics are fairly localized to a small part of OMPI).
>
>
>
> On Feb 9, 2010, at 11:42 AM, Mostyn Lewis wrote:
>
>> All,
>>
>> FWIW, Pathscale is dying in the new atomics in 1.4.1 (and svn trunk) - actually looping -
>>
>> from gdb:
>>
>> opal_progress_event_users_decrement () at ../.././opal/include/opal/sys/atomic_impl.h:61
>> 61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
>> Current language: auto; currently asm
>> (gdb) where
>> #0 opal_progress_event_users_decrement () at ../.././opal/include/opal/sys/atomic_impl.h:61
>> #1 0x0000000000000001 in ?? ()
>> #2 0x00002aec4cf6a5e0 in ?? ()
>> #3 0x00000000000000eb in ?? ()
>> #4 0x00002aec4cfb57e0 in ompi_mpi_init () at ../.././ompi/runtime/ompi_mpi_init.c:818
>> #5 0x00007fff5db3bd58 in ?? ()
>> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
>> (gdb) list
>> 56 {
>> 57 int32_t oldval;
>> 58
>> 59 do {
>> 60 oldval = *addr;
>> 61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
>> 62 return (oldval - delta);
>> 63 }
>> 64 #endif /* OPAL_HAVE_ATOMIC_SUB_32 */
>> 65
>> (gdb)
>>
>> DM
>>
>> On Tue, 9 Feb 2010, Jeff Squyres wrote:
>>
>>> FWIW, I have had terrible luck with the pathscale compiler over the years. Repeated attempts to get support from them -- even when I was a paying customer -- resulted in no help (e.g., a pathCC bug with the OMPI C++ bindings that I filed years ago was never resolved).
>>>
>>> Is this compiler even supported anymore? I.e., is there a support department somewhere that you have a hope of getting any help from?
>>>
>>> I can't say for sure, of course, but if MPI hello world hangs, it smells like a compiler bug. You might want to attach to "hello world" in a debugger and see where it's hung. You might need to compile OMPI with debugging symbols to get any meaningful information.
>>>
>>> ** NOTE: My personal feelings about the pathscale compiler suite do not reflect anyone else's feelings in the Open MPI community. Perhaps someone could change my mind someday, but *I* have personally given up on this compiler. :-(
>>>
>>>
>>> On Feb 8, 2010, at 2:38 AM, Rafael Arco Arredondo wrote:
>>>
>>>> Hello,
>>>>
>>>> It does work with version 1.4. This is the hello world that hangs with
>>>> 1.4.1:
>>>>
>>>> #include <stdio.h>
>>>> #include <mpi.h>
>>>>
>>>> int main(int argc, char **argv)
>>>> {
>>>> int node, size;
>>>>
>>>> MPI_Init(&argc,&argv);
>>>> MPI_Comm_rank(MPI_COMM_WORLD, &node);
>>>> MPI_Comm_size(MPI_COMM_WORLD, &size);
>>>>
>>>> printf("Hello World from Node %d of %d.\n", node, size);
>>>>
>>>> MPI_Finalize();
>>>> return 0;
>>>> }
>>>>
>>>> On Tue, 26-01-2010 at 03:57 -0500, Åke Sandgren wrote:
>>>>> 1 - Do you have problems with openmpi 1.4 too? (I don't, haven't built
>>>>> 1.4.1 yet)
>>>>> 2 - There is a bug in the pathscale compiler with -fPIC and -g that
>>>>> generates incorrect dwarf2 data so debuggers get really confused and
>>>>> will have BIG problems debugging the code. I'm chasing them to get a
>>>>> fix...
>>>>> 3 - Do you have an example code that has problems?
>>>>
>>>> --
>>>> Rafael Arco Arredondo
>>>> Centro de Servicios de Informática y Redes de Comunicaciones
>>>> Universidad de Granada
>>>>
>>>
>>>
>>> --
>>> Jeff Squyres
>>> jsquyres_at_[hidden]
>>>
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>
>>>
>>
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
>
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>