Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-02-09 17:51:44


Iain did the genius work for the new assembly. Iain -- can you respond?
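
My guess, FWIW (Iain can confirm or correct): without the "memory" clobber,
the compiler is allowed to assume that *addr is unchanged across the asm, so
it can keep the value feeding the retry loop in atomic_impl.h cached in a
register -- the loop then never observes what the cmpxchg wrote back and
spins forever. A quick cross-check is the GCC builtin; a sketch, assuming
pathcc implements the __sync builtins (the helper name is mine):

    #include <stdint.h>

    static inline int cmpset_32_builtin(volatile int32_t *addr,
                                        int32_t oldval, int32_t newval)
    {
        /* same contract as opal_atomic_cmpset_32: store newval only if
           *addr == oldval; return nonzero on success */
        return (int)__sync_bool_compare_and_swap(addr, oldval, newval);
    }

If substituting that also makes the hang go away, the constraint list in the
new asm is almost certainly the culprit.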

On Feb 9, 2010, at 5:44 PM, Mostyn Lewis wrote:

> The old opal_atomic_cmpset_32 worked:
>
> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>                                          int32_t oldval, int32_t newval)
> {
>     unsigned char ret;
>     __asm__ __volatile__ (
>                 SMPLOCK "cmpxchgl %1,%2   \n\t"
>                         "sete     %0      \n\t"
>                 : "=qm" (ret)
>                 : "q"(newval), "m"(*addr), "a"(oldval)
>                 : "memory");
>
>     return (int)ret;
> }
>
> The new opal_atomic_cmpset_32 fails:
>
> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>                                          int32_t oldval, int32_t newval)
> {
>     unsigned char ret;
>     __asm__ __volatile__ (
>                 SMPLOCK "cmpxchgl %3,%4   \n\t"
>                         "sete     %0      \n\t"
>                 : "=qm" (ret), "=a" (oldval), "=m" (*addr)
>                 : "q"(newval), "m"(*addr), "1"(oldval));
>
>     return (int)ret;
> }
>
> **However**, if you put back the memory "clobber" line (the 3rd ":"), it works:
>
> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>                                          int32_t oldval, int32_t newval)
> {
>     unsigned char ret;
>     __asm__ __volatile__ (
>                 SMPLOCK "cmpxchgl %3,%4   \n\t"
>                         "sete     %0      \n\t"
>                 : "=qm" (ret), "=a" (oldval), "=m" (*addr)
>                 : "q"(newval), "m"(*addr), "1"(oldval)
>                 : "memory");
>
>     return (int)ret;
> }
>
> This works in a test case for pathcc, gcc, icc, pgcc, Sun Studio cc, and open64 (pathscale
> lineage - which also fails with 1.4.1).
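>
> Roughly, my test harness was along these lines (a sketch, not the exact
> file; the main() is mine and only the cmpset variant comes from OMPI):
>
> #include <stdio.h>
> #include <stdint.h>
>
> #define SMPLOCK "lock; "
>
> /* the fixed variant (memory clobber restored) from above */
> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>                                          int32_t oldval, int32_t newval)
> {
>     unsigned char ret;
>     __asm__ __volatile__ (
>                 SMPLOCK "cmpxchgl %3,%4   \n\t"
>                         "sete     %0      \n\t"
>                 : "=qm" (ret), "=a" (oldval), "=m" (*addr)
>                 : "q"(newval), "m"(*addr), "1"(oldval)
>                 : "memory");
>
>     return (int)ret;
> }
>
> int main(void)
> {
>     volatile int32_t val = 42;
>     int32_t old;
>
>     /* same shape as the opal_atomic_sub_32 loop that spins in 1.4.1 */
>     do {
>         old = val;
>     } while (0 == opal_atomic_cmpset_32(&val, old, old - 1));
>
>     printf("val = %d (expect 41)\n", (int)val);
>     return 0;
> }
>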
> Also, the SMPLOCK above is defined as "lock; " - the ";" is a GNU as statement delimiter - is
> that right? It seems to work with or without the ";".
>
>
> Also, a question - I see you generate, via perl, another "lock" asm file which you put into
> opal/asm/generated/<whatever, e.g. atomic-amd64-linux.s> and build into libasm - as far as I
> can tell, what you generate there hasn't changed across 1.4 -> 1.4.1 -> svn trunk?
>
> DM
>
> On Tue, 9 Feb 2010, Jeff Squyres wrote:
>
> > Perhaps someone with a pathscale compiler support contract can investigate this with them.
> >
> > Have them contact us if they want/need help understanding our atomics; we're happy to explain, etc. (the atomics are fairly localized to a small part of OMPI).
> >
> >
> >
> > On Feb 9, 2010, at 11:42 AM, Mostyn Lewis wrote:
> >
> >> All,
> >>
> >> FWIW, Pathscale is hanging in the new atomics in 1.4.1 (and svn trunk) - actually looping -
> >>
> >> from gdb:
> >>
> >> opal_progress_event_users_decrement () at ../.././opal/include/opal/sys/atomic_impl.h:61
> >> 61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
> >> Current language: auto; currently asm
> >> (gdb) where
> >> #0 opal_progress_event_users_decrement () at ../.././opal/include/opal/sys/atomic_impl.h:61
> >> #1 0x0000000000000001 in ?? ()
> >> #2 0x00002aec4cf6a5e0 in ?? ()
> >> #3 0x00000000000000eb in ?? ()
> >> #4 0x00002aec4cfb57e0 in ompi_mpi_init () at ../.././ompi/runtime/ompi_mpi_init.c:818
> >> #5 0x00007fff5db3bd58 in ?? ()
> >> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
> >> (gdb) list
> >> 56 {
> >> 57 int32_t oldval;
> >> 58
> >> 59 do {
> >> 60 oldval = *addr;
> >> 61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
> >> 62 return (oldval - delta);
> >> 63 }
> >> 64 #endif /* OPAL_HAVE_ATOMIC_SUB_32 */
> >> 65
> >> (gdb)
> >>
> >> DM
> >>
> >> On Tue, 9 Feb 2010, Jeff Squyres wrote:
> >>
> >>> FWIW, I have had terrible luck with the pathscale compiler over the years. Repeated attempts to get support from them -- even when I was a paying customer -- resulted in no help (e.g., a pathCC bug with the OMPI C++ bindings that I filed years ago was never resolved).
> >>>
> >>> Is this compiler even supported anymore? I.e., is there a support department somewhere that you have a hope of getting any help from?
> >>>
> >>> I can't say for sure, of course, but if MPI hello world hangs, it smells like a compiler bug. You might want to attach to "hello world" in a debugger and see where it's hung. You might need to compile OMPI with debugging symbols to get any meaningful information.
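> >>>
> >>> Something like this (exact flags from memory, so treat it as a sketch):
> >>>
> >>>     shell$ ./configure --enable-debug --prefix=... && make install
> >>>     shell$ mpirun -np 2 ./hello &
> >>>     shell$ gdb -p <pid of one of the hello processes>
> >>>     (gdb) bt
> >>>
> >>> The backtrace should at least show which loop it is stuck in.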
> >>>
> >>> ** NOTE: My personal feelings about the pathscale compiler suite do not reflect anyone else's feelings in the Open MPI community. Perhaps someone could change my mind someday, but *I* have personally given up on this compiler. :-(
> >>>
> >>>
> >>> On Feb 8, 2010, at 2:38 AM, Rafael Arco Arredondo wrote:
> >>>
> >>>> Hello,
> >>>>
> >>>> It does work with version 1.4. This is the hello world that hangs with
> >>>> 1.4.1:
> >>>>
> >>>> #include <stdio.h>
> >>>> #include <mpi.h>
> >>>>
> >>>> int main(int argc, char **argv)
> >>>> {
> >>>>     int node, size;
> >>>>
> >>>>     MPI_Init(&argc, &argv);
> >>>>     MPI_Comm_rank(MPI_COMM_WORLD, &node);
> >>>>     MPI_Comm_size(MPI_COMM_WORLD, &size);
> >>>>
> >>>>     printf("Hello World from Node %d of %d.\n", node, size);
> >>>>
> >>>>     MPI_Finalize();
> >>>>     return 0;
> >>>> }
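> >>>>
> >>>> (For reference, built and launched in the usual way, nothing exotic:
> >>>>
> >>>>     shell$ mpicc hello.c -o hello
> >>>>     shell$ mpirun -np 2 ./hello
> >>>>
> >>>> It hangs before printing anything.)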
> >>>>
> >>>> On Tue, 2010-01-26 at 03:57 -0500, Åke Sandgren wrote:
> >>>>> 1 - Do you have problems with openmpi 1.4 too? (I don't; I haven't built
> >>>>> 1.4.1 yet.)
> >>>>> 2 - There is a bug in the pathscale compiler with -fPIC and -g that
> >>>>> generates incorrect dwarf2 data, so debuggers get really confused and
> >>>>> will have BIG problems debugging the code. I'm chasing them for a
> >>>>> fix...
> >>>>> 3 - Do you have an example code that has problems?
> >>>>
> >>>> --
> >>>> Rafael Arco Arredondo
> >>>> Centro de Servicios de Informática y Redes de Comunicaciones
> >>>> Universidad de Granada

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/