Open MPI Development Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-10-22 16:45:21


On Oct 22, 2007, at 2:49 PM, Bogdan Costescu wrote:

> <short version>
> Is there some known incompatibility of the latest stable versions with
> the PathScale 3.0 compilers ?
> </short version>

There is one in the openib BTL. We've had an open issue with PathScale
for many months. They can reproduce the error and have narrowed it
down to a single .o file, but had not yet found the specific problem
as of the last update I heard, a few months ago.

To be honest, I removed the pathscale suite from my regular
regression testing many months ago because of this long-standing
problem; I don't know if any other pathscale-specific issues have
crept in since then.

> <long version>
> I have a very puzzling problem with the following combination:
> - PathScale 3.0 suite
> - Open MPI 1.2.3 and 1.2.4 (both behave the same)
> - Debian etch, kernel 2.6.22.9/x86_64 running on AMD Opteron

I just recompiled the OMPI 1.2 branch with pathscale 3.0 on RHEL4U4
and I do not see the problems that you are seeing. :-\ Is Debian
etch a supported pathscale platform?

[13:44] svbu-mpi:/home/jsquyres/openmpi-1.2.4 % ompi_info
                 Open MPI: 1.2.4
    Open MPI SVN revision: r16187
                 Open RTE: 1.2.4
    Open RTE SVN revision: r16187
                     OPAL: 1.2.4
        OPAL SVN revision: r16187
                   Prefix: /home/jsquyres/bogus
  Configured architecture: x86_64-unknown-linux-gnu
            Configured by: jsquyres
            Configured on: Mon Oct 22 13:34:17 PDT 2007
           Configure host: svbu-mpi.cisco.com
                 Built by: jsquyres
                 Built on: Mon Oct 22 13:40:55 PDT 2007
               Built host: svbu-mpi.cisco.com
               C bindings: yes
             C++ bindings: yes
       Fortran77 bindings: yes (all)
       Fortran90 bindings: yes
  Fortran90 bindings size: small
               C compiler: pathcc
      C compiler absolute: /opt/pathscale/3.0/bin/pathcc
             C++ compiler: pathCC
    C++ compiler absolute: /opt/pathscale/3.0/bin/pathCC
       Fortran77 compiler: pathf90
   Fortran77 compiler abs: /opt/pathscale/3.0/bin/pathf90
       Fortran90 compiler: pathf90
   Fortran90 compiler abs: /opt/pathscale/3.0/bin/pathf90
              C profiling: yes
....etc.

> Upon invoking any installed binary (ompi_info, mpif90 --showinfo), I
> get a segmentation fault. The trace looks strange (to me, at
> least ;-)):
>
> Program terminated with signal 11, Segmentation fault.
> #0 0x00000000004430d9 in _int_free (av=0x5b1ea0, mem=0x5b40b0) at /home/thor1/costescu/build/openmpi-1.2.4/opal/mca/memory/ptmalloc2/malloc.c:4416
> 4416 fwd->bk = p;
> (gdb) bt
> #0 0x00000000004430d9 in _int_free (av=0x5b1ea0, mem=0x5b40b0) at /home/thor1/costescu/build/openmpi-1.2.4/opal/mca/memory/ptmalloc2/malloc.c:4416
> #1 0x000000000044141b in free (mem=0x5b40b0) at /home/thor1/costescu/build/openmpi-1.2.4/opal/mca/memory/ptmalloc2/malloc.c:3513
> #2 0x00002b27dc920590 in vasprintf () from /lib/libc.so.6
> #3 0x00002b27dc906588 in asprintf () from /lib/libc.so.6
> #4 0x0000000000421274 in opal_output_init () at /home/thor1/costescu/build/openmpi-1.2.4/opal/util/output.c:130
> #5 0x0000000000421c83 in do_open (output_id=-1, lds=0x591530) at /home/thor1/costescu/build/openmpi-1.2.4/opal/util/output.c:422
> #6 0x0000000000421529 in opal_output_open (lds=0x591530) at /home/thor1/costescu/build/openmpi-1.2.4/opal/util/output.c:176
> #7 0x00000000004201e4 in opal_malloc_init () at /home/thor1/costescu/build/openmpi-1.2.4/opal/util/malloc.c:67
> #8 0x000000000040e6ac in opal_init_util () at runtime/opal_init.c:137
> #9 0x000000000040932e in main (argc=2, argv=0x7fffceb02608) at /home/thor1/costescu/build/openmpi-1.2.4/opal/tools/wrappers/opal_wrapper.c:424
>
> This happens only with the PathScale 3.0 compilers; I have no problems
> when using the default gcc and friends version 4.1.2 compilers; I also
> have no problems in using the PathScale 3.0 compilers either alone or
> with Myricom's MPICH/MX.
>
> The problem build was obtained after:
>
> ./configure --prefix=/home/thor1/costescu/openmpi-1.2.4-ps30 --enable-static --disable-shared --with-mx=/opt_local/mx --disable-io-romio --enable-debug --enable-pretty-print-stacktrace
>
> (configure and make logs available on request)
>
> I thought about asking here first to avoid any 'this is known' or
> embarrassing errors that I might have made, before filing a bug
> report. The existing bugs related to PathScale compilers don't seem
> to describe the symptoms that I'm seeing, unless it's some kind of
> threading issue which seems to have no resolution yet...
>
> Thanks in advance !
> </long version>
>
> --
> Bogdan Costescu
>
> IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
> Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
> Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
> E-mail: Bogdan.Costescu_at_[hidden]

-- 
Jeff Squyres
Cisco Systems