Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] problems with mpiJava in openmpi-1.9a1r27362
From: Siegmar Gross (Siegmar.Gross_at_[hidden])
Date: 2012-09-26 10:30:59


Hi,

> I'm on the road the rest of this week, but can look at this when I return
> next week. It looks like something unrelated to the Java bindings failed to
> properly initialize - at a guess, I'd suspect that you are missing the
> LD_LIBRARY_PATH setting so none of the OMPI libs were found.

Perhaps the output of my environment program helps in that case.
I have attached it; it was produced with the following command.

mpiexec -np 4 -host linpc4,sunpc4,rs0 environ_mpi \
>& env_linpc_sunpc_sparc.txt
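
The source of environ_mpi is not part of this message. As a rough illustration of the one-entry-per-line layout used in the attached dump, a search-path variable could be printed as in the sketch below. It assumes a POSIX sh (the commands in this thread use csh-style redirection), and print_path_var is a hypothetical helper name, not something from Open MPI or from environ_mpi.

```shell
# Hypothetical helper: print a variable name, then each entry of its
# colon-separated value on its own indented line.
print_path_var() {
    printf '    %s\n' "$1"
    # tr turns every ':' into a newline; sed adds the indentation.
    printf '%s\n' "$2" | tr ':' '\n' | sed 's/^/                       /'
}

# Example call; the default expansion avoids an error if the variable is unset.
print_path_var LD_LIBRARY_PATH "${LD_LIBRARY_PATH:-}"
```

Printing the entries in their original order keeps the search order visible, which is what the dynamic linker cares about.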

Thank you very much in advance for your help.

Kind regards

Siegmar

> On Wed, Sep 26, 2012 at 5:42 AM, Siegmar Gross <
> Siegmar.Gross_at_[hidden]> wrote:
>
> > Hi,
> >
> > yesterday I installed openmpi-1.9a1r27362 on Solaris and Linux and
> > I have a problem with mpiJava on Linux (openSUSE-Linux 12.1, x86_64).
> >
> >
> > linpc4 mpi_classfiles 104 javac HelloMainWithoutMPI.java
> > linpc4 mpi_classfiles 105 mpijavac HelloMainWithBarrier.java
> > linpc4 mpi_classfiles 106 mpijavac -showme
> > /usr/local/jdk1.7.0_07-64/bin/javac \
> > -cp ...:.:/usr/local/openmpi-1.9_64_cc/lib64/mpi.jar
> >
> >
> > It works with Java without MPI.
> >
> > linpc4 mpi_classfiles 107 mpiexec java -cp $HOME/mpi_classfiles \
> > HelloMainWithoutMPI
> > Hello from linpc4.informatik.hs-fulda.de/193.174.26.225
> >
> >
> > It breaks with Java and MPI.
> >
> > linpc4 mpi_classfiles 108 mpiexec java -cp $HOME/mpi_classfiles \
> > HelloMainWithBarrier
> > --------------------------------------------------------------------------
> > It looks like opal_init failed for some reason; your parallel process is
> > likely to abort. There are many reasons that a parallel process can
> > fail during opal_init; some of which are due to configuration or
> > environment problems. This failure appears to be an internal failure;
> > here's some additional information (which may only be relevant to an
> > Open MPI developer):
> >
> > mca_base_open failed
> > --> Returned value -2 instead of OPAL_SUCCESS
> > --------------------------------------------------------------------------
> > --------------------------------------------------------------------------
> > It looks like orte_init failed for some reason; your parallel process is
> > likely to abort. There are many reasons that a parallel process can
> > fail during orte_init; some of which are due to configuration or
> > environment problems. This failure appears to be an internal failure;
> > here's some additional information (which may only be relevant to an
> > Open MPI developer):
> >
> > opal_init failed
> > --> Returned value Out of resource (-2) instead of ORTE_SUCCESS
> > --------------------------------------------------------------------------
> > --------------------------------------------------------------------------
> > It looks like MPI_INIT failed for some reason; your parallel process is
> > likely to abort. There are many reasons that a parallel process can
> > fail during MPI_INIT; some of which are due to configuration or environment
> > problems. This failure appears to be an internal failure; here's some
> > additional information (which may only be relevant to an Open MPI
> > developer):
> >
> > ompi_mpi_init: orte_init failed
> > --> Returned "Out of resource" (-2) instead of "Success" (0)
> > --------------------------------------------------------------------------
> > *** An error occurred in MPI_Init
> > *** on a NULL communicator
> > *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> > *** and potentially your MPI job)
> > [linpc4:15332] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
> > -------------------------------------------------------
> > Primary job terminated normally, but 1 process returned
> > a non-zero exit code.. Per user-direction, the job has been aborted.
> > -------------------------------------------------------
> > --------------------------------------------------------------------------
> > mpiexec detected that one or more processes exited with non-zero status, thus causing
> > the job to be terminated. The first process to do so was:
> >
> > Process name: [[58875,1],0]
> > Exit code: 1
> > --------------------------------------------------------------------------
> >
> >
> > I configured with the following command.
> >
> > ../openmpi-1.9a1r27362/configure --prefix=/usr/local/openmpi-1.9_64_cc \
> > --libdir=/usr/local/openmpi-1.9_64_cc/lib64 \
> > --with-jdk-bindir=/usr/local/jdk1.7.0_07-64/bin \
> > --with-jdk-headers=/usr/local/jdk1.7.0_07-64/include \
> > JAVA_HOME=/usr/local/jdk1.7.0_07-64 \
> > LDFLAGS="-m64" \
> > CC="cc" CXX="CC" FC="f95" \
> > CFLAGS="-m64" CXXFLAGS="-m64 -library=stlport4" FCFLAGS="-m64" \
> > CPP="cpp" CXXCPP="cpp" \
> > CPPFLAGS="" CXXCPPFLAGS="" \
> > C_INCL_PATH="" C_INCLUDE_PATH="" CPLUS_INCLUDE_PATH="" \
> > OBJC_INCLUDE_PATH="" OPENMPI_HOME="" \
> > --enable-cxx-exceptions \
> > --enable-mpi-java \
> > --enable-heterogeneous \
> > --enable-opal-multi-threads \
> > --enable-mpi-thread-multiple \
> > --with-threads=posix \
> > --with-hwloc=internal \
> > --without-verbs \
> > --without-udapl \
> > --with-wrapper-cflags=-m64 \
> > --enable-debug \
> > |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_cc
> >
> >
> > It works fine on Solaris machines as long as all hosts have the
> > same architecture (SPARC or x86_64).
> >
> > tyr mpi_classfiles 194 mpiexec -host sunpc0,sunpc1,sunpc4 \
> > java -cp $HOME/mpi_classfiles HelloMainWithBarrier
> > Process 1 of 3 running on sunpc1
> > Process 2 of 3 running on sunpc4.informatik.hs-fulda.de
> > Process 0 of 3 running on sunpc0
> >
> > sunpc4 fd1026 107 mpiexec -host tyr,rs0,rs1 \
> > java -cp $HOME/mpi_classfiles HelloMainWithBarrier
> > Process 1 of 3 running on rs0.informatik.hs-fulda.de
> > Process 2 of 3 running on rs1.informatik.hs-fulda.de
> > Process 0 of 3 running on tyr.informatik.hs-fulda.de
> >
> >
> > It breaks if the hosts mix both architectures.
> >
> > sunpc4 fd1026 106 mpiexec -host tyr,rs0,sunpc1 \
> > java -cp $HOME/mpi_classfiles HelloMainWithBarrier
> > [rs0.informatik.hs-fulda.de:7718] *** An error occurred in MPI_Comm_dup
> > [rs0.informatik.hs-fulda.de:7718] *** reported by process [565116929,1]
> > [rs0.informatik.hs-fulda.de:7718] *** on communicator MPI_COMM_WORLD
> > [rs0.informatik.hs-fulda.de:7718] *** MPI_ERR_INTERN: internal error
> > [rs0.informatik.hs-fulda.de:7718] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> > [rs0.informatik.hs-fulda.de:7718] *** and potentially your MPI job)
> > [sunpc4.informatik.hs-fulda.de:07900] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
> > [sunpc4.informatik.hs-fulda.de:07900] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> >
> >
> > Please let me know if I can provide anything else to help track down
> > these errors. Thank you very much in advance for any help.
> >
> >
> > Kind regards
> >
> > Siegmar
> >


[sunpc4.informatik.hs-fulda.de][[4083,1],2][../../../../../openmpi-1.9a1r27362/ompi/mca/btl/sctp/btl_sctp_proc.c:143:mca_btl_sctp_proc_create] mca_base_modex_recv: failed with return value=-13
[rs0.informatik.hs-fulda.de][[4083,1],3][../../../../../openmpi-1.9a1r27362/ompi/mca/btl/sctp/btl_sctp_proc.c:143:mca_btl_sctp_proc_create] mca_base_modex_recv: failed with return value=-13
[rs0.informatik.hs-fulda.de][[4083,1],3][../../../../../openmpi-1.9a1r27362/ompi/mca/btl/sctp/btl_sctp_proc.c:143:mca_btl_sctp_proc_create] mca_base_modex_recv: failed with return value=-13
[rs0.informatik.hs-fulda.de][[4083,1],3][../../../../../openmpi-1.9a1r27362/ompi/mca/btl/sctp/btl_sctp_proc.c:143:mca_btl_sctp_proc_create] mca_base_modex_recv: failed with return value=-13

Now 3 slave tasks are sending their environment.

Environment from task 1:
  message type: 3
  msg length: 3911 characters
  message:
    hostname: linpc4
    operating system: Linux
    release: 3.1.9-1.4-desktop
    processor: x86_64
    PATH
                       /usr/local/eclipse-3.6.1
                       /usr/local/NetBeans-4.0/bin
                       /usr/local/jdk1.7.0_07-64/bin
                       /usr/local/apache-ant-1.6.2/bin
                       /usr/local/icc-9.1/idb/bin
                       /usr/local/icc-9.1/cc/bin
                       /usr/local/icc-9.1/fc/bin
                       /usr/local/gcc-4.7.1/bin
                       /opt/solstudio12.3/bin
                       /usr/local/bin
                       /usr/local/ssl/bin
                       /usr/local/pgsql/bin
                       /bin
                       /usr/bin
                       /usr/X11R6/bin
                       /usr/local/teTeX-1.0.7/bin/i586-pc-linux-gnu
                       /usr/local/bluej-2.1.2
                       /usr/local/openmpi-1.9_64_cc/bin
                       /home/fd1026/Linux/x86_64/bin
                       .
                       /usr/sbin
    LD_LIBRARY_PATH_32
                       /usr/lib
                       /usr/local/jdk1.7.0_07-64/jre/lib/i386
                       /usr/local/gcc-4.7.1/lib
                       /usr/local/gcc-4.7.1/libexec/gcc/x86_64-unknown-linux-gnu/4.7.1/32
                       /usr/local/gcc-4.7.1/lib/gcc/x86_64-unknown-linux-gnu/4.7.1/32
                       /usr/local/lib
                       /usr/local/ssl/lib
                       /lib
                       /usr/lib
                       /usr/X11R6/lib
                       /usr/local/openmpi-1.9_64_cc/lib
                       /home/fd1026/Linux/x86_64/lib
    LD_LIBRARY_PATH_64
                       /usr/lib64
                       /usr/local/jdk1.7.0_07-64/jre/lib/amd64
                       /usr/local/gcc-4.7.1/lib64
                       /usr/local/gcc-4.7.1/libexec/gcc/x86_64-unknown-linux-gnu/4.7.1
                       /usr/local/gcc-4.7.1/lib/gcc/x86_64-unknown-linux-gnu/4.7.1
                       /usr/local/lib64
                       /usr/local/ssl/lib64
                       /usr/lib64
                       /usr/X11R6/lib64
                       /usr/local/openmpi-1.9_64_cc/lib64
                       /home/fd1026/Linux/x86_64/lib64
    LD_LIBRARY_PATH
                       /usr/lib
                       /usr/local/jdk1.7.0_07-64/jre/lib/i386
                       /usr/local/gcc-4.7.1/lib
                       /usr/local/gcc-4.7.1/libexec/gcc/x86_64-unknown-linux-gnu/4.7.1/32
                       /usr/local/gcc-4.7.1/lib/gcc/x86_64-unknown-linux-gnu/4.7.1/32
                       /usr/local/lib
                       /usr/local/ssl/lib
                       /lib
                       /usr/lib
                       /usr/X11R6/lib
                       /usr/local/openmpi-1.9_64_cc/lib
                       /usr/lib64
                       /usr/local/jdk1.7.0_07-64/jre/lib/amd64
                       /usr/local/gcc-4.7.1/lib64
                       /usr/local/gcc-4.7.1/libexec/gcc/x86_64-unknown-linux-gnu/4.7.1
                       /usr/local/gcc-4.7.1/lib/gcc/x86_64-unknown-linux-gnu/4.7.1
                       /usr/local/lib64
                       /usr/local/ssl/lib64
                       /usr/lib64
                       /usr/X11R6/lib64
                       /usr/local/openmpi-1.9_64_cc/lib64
                       /home/fd1026/Linux/x86_64/lib64
    CLASSPATH
                       /usr/local/junit4.10
                       /usr/local/junit4.10/junit-4.10.jar
                       //usr/local/jdk1.7.0_07-64/j3d/lib/ext/j3dcore.jar
                       //usr/local/jdk1.7.0_07-64/j3d/lib/ext/j3dutils.jar
                       //usr/local/jdk1.7.0_07-64/j3d/lib/ext/vecmath.jar
                       /usr/local/javacc-5.0/javacc.jar
                       .

Environment from task 2:
  message type: 3
  msg length: 4196 characters
  message:
    hostname: sunpc4.informatik.hs-fulda.de
    operating system: SunOS
    release: 5.10
    processor: i86pc
    PATH
                       /usr/local/eclipse-3.6.1
                       /usr/local/NetBeans-4.0/bin
                       /usr/local/jdk1.7.0_07/bin/amd64
                       /usr/local/apache-ant-1.6.2/bin
                       /usr/local/gcc-4.7.1/bin
                       /opt/solstudio12.3/bin
                       /usr/local/bin
                       /usr/local/ssl/bin
                       /usr/local/pgsql/bin
                       /usr/bin
                       /usr/openwin/bin
                       /usr/dt/bin
                       /usr/ccs/bin
                       /usr/sfw/bin
                       /opt/sfw/bin
                       /usr/ucb
                       /usr/lib/lp/postscript
                       /usr/local/teTeX-1.0.7/bin/i386-pc-solaris2.10
                       /usr/local/bluej-2.1.2
                       /usr/local/openmpi-1.9_64_cc/bin
                       /home/fd1026/SunOS/x86_64/bin
                       .
                       /usr/sbin
    LD_LIBRARY_PATH_32
                       /usr/lib
                       /usr/local/jdk1.7.0_07/jre/lib/i386
                       /usr/local/gcc-4.7.1/lib
                       /usr/local/gcc-4.7.1/lib/gcc/i386-pc-solaris2.10/4.7.1
                       /usr/local/lib
                       /usr/local/ssl/lib
                       /usr/local/oracle
                       /usr/local/pgsql/lib
                       /usr/lib
                       /usr/openwin/lib
                       /usr/openwin/server/lib
                       /usr/dt/lib
                       /usr/X11R6/lib
                       /usr/ccs/lib
                       /usr/sfw/lib
                       /opt/sfw/lib
                       /usr/ucblib
                       /usr/local/openmpi-1.9_64_cc/lib
                       /home/fd1026/SunOS/x86_64/lib
    LD_LIBRARY_PATH_64
                       /usr/lib/amd64
                       /usr/local/jdk1.7.0_07/jre/lib/amd64
                       /usr/local/gcc-4.7.1/lib/amd64
                       /usr/local/gcc-4.7.1/lib/gcc/i386-pc-solaris2.10/4.7.1/amd64
                       /usr/local/lib/amd64
                       /usr/local/ssl/lib/amd64
                       /usr/local/lib64
                       /usr/lib/amd64
                       /usr/openwin/lib/amd64
                       /usr/openwin/server/lib/amd64
                       /usr/dt/lib/amd64
                       /usr/X11R6/lib/amd64
                       /usr/ccs/lib/amd64
                       /usr/sfw/lib/amd64
                       /opt/sfw/lib/amd64
                       /usr/ucblib/amd64
                       /usr/local/openmpi-1.9_64_cc/lib64
                       /home/fd1026/SunOS/x86_64/lib64
    LD_LIBRARY_PATH
                       /usr/lib/amd64
                       /usr/local/jdk1.7.0_07/jre/lib/amd64
                       /usr/local/gcc-4.7.1/lib/amd64
                       /usr/local/gcc-4.7.1/lib/gcc/i386-pc-solaris2.10/4.7.1/amd64
                       /usr/local/lib/amd64
                       /usr/local/ssl/lib/amd64
                       /usr/local/lib64
                       /usr/lib/amd64
                       /usr/openwin/lib/amd64
                       /usr/openwin/server/lib/amd64
                       /usr/dt/lib/amd64
                       /usr/X11R6/lib/amd64
                       /usr/ccs/lib/amd64
                       /usr/sfw/lib/amd64
                       /opt/sfw/lib/amd64
                       /usr/ucblib/amd64
                       /usr/local/openmpi-1.9_64_cc/lib64
                       /home/fd1026/SunOS/x86_64/lib64
    CLASSPATH
                       /usr/local/junit4.10
                       /usr/local/junit4.10/junit-4.10.jar
                       //usr/local/jdk1.7.0_07/j3d/lib/ext/j3dcore.jar
                       //usr/local/jdk1.7.0_07/j3d/lib/ext/j3dutils.jar
                       //usr/local/jdk1.7.0_07/j3d/lib/ext/vecmath.jar
                       /usr/local/javacc-5.0/javacc.jar
                       .

Environment from task 3:
  message type: 3
  msg length: 4394 characters
  message:
    hostname: rs0.informatik.hs-fulda.de
    operating system: SunOS
    release: 5.10
    processor: sun4u
    PATH
                       /usr/local/eclipse-3.6.1
                       /usr/local/NetBeans-4.0/bin
                       /usr/local/jdk1.7.0_07/bin/sparcv9
                       /usr/local/apache-ant-1.6.2/bin
                       /usr/local/gcc-4.7.1/bin
                       /opt/solstudio12.3/bin
                       /usr/local/bin
                       /usr/local/ssl/bin
                       /usr/local/pgsql/bin
                       /usr/bin
                       /usr/openwin/bin
                       /usr/dt/bin
                       /usr/ccs/bin
                       /usr/sfw/bin
                       /opt/sfw/bin
                       /usr/ucb
                       /usr/xpg4/bin
                       /usr/local/teTeX-1.0.7/bin/sparc-sun-solaris2.10
                       /usr/local/bluej-2.1.2
                       /usr/local/openmpi-1.9_64_cc/bin
                       /home/fd1026/SunOS/sparc/bin
                       .
                       /usr/sbin
    LD_LIBRARY_PATH_32
                       /usr/lib
                       /usr/local/jdk1.7.0_07/jre/lib/sparc
                       /usr/local/gcc-4.7.1/lib
                       /usr/local/gcc-4.7.1/lib/gcc/sparc-sun-solaris2.10/4.7.1
                       /usr/local/lib
                       /usr/local/ssl/lib
                       /usr/local/oracle
                       /usr/local/pgsql/lib
                       /lib
                       /usr/lib
                       /usr/openwin/lib
                       /usr/dt/lib
                       /usr/X11R6/lib
                       /usr/ccs/lib
                       /usr/sfw/lib
                       /opt/sfw/lib
                       /usr/ucblib
                       /usr/local/openmpi-1.9_64_cc/lib
                       /home/fd1026/SunOS/sparc/lib
    LD_LIBRARY_PATH_64
                       /usr/lib/sparcv9
                       /usr/local/jdk1.7.0_07/jre/lib/sparcv9
                       /usr/local/gcc-4.7.1/lib/sparcv9
                       /usr/local/gcc-4.7.1/lib/gcc/sparc-sun-solaris2.10/4.7.1/sparcv9
                       /usr/local/lib/sparcv9
                       /usr/local/ssl/lib/sparcv9
                       /usr/local/lib64
                       /usr/local/oracle/sparcv9
                       /usr/local/pgsql/lib/sparcv9
                       /lib/sparcv9
                       /usr/lib/sparcv9
                       /usr/openwin/lib/sparcv9
                       /usr/dt/lib/sparcv9
                       /usr/X11R6/lib/sparcv9
                       /usr/ccs/lib/sparcv9
                       /usr/sfw/lib/sparcv9
                       /opt/sfw/lib/sparcv9
                       /usr/ucblib/sparcv9
                       /usr/local/openmpi-1.9_64_cc/lib64
                       /home/fd1026/SunOS/sparc/lib64
    LD_LIBRARY_PATH
                       /usr/lib/sparcv9
                       /usr/local/jdk1.7.0_07/jre/lib/sparcv9
                       /usr/local/gcc-4.7.1/lib/sparcv9
                       /usr/local/gcc-4.7.1/lib/gcc/sparc-sun-solaris2.10/4.7.1/sparcv9
                       /usr/local/lib/sparcv9
                       /usr/local/ssl/lib/sparcv9
                       /usr/local/lib64
                       /usr/local/oracle/sparcv9
                       /usr/local/pgsql/lib/sparcv9
                       /lib/sparcv9
                       /usr/lib/sparcv9
                       /usr/openwin/lib/sparcv9
                       /usr/dt/lib/sparcv9
                       /usr/X11R6/lib/sparcv9
                       /usr/ccs/lib/sparcv9
                       /usr/sfw/lib/sparcv9
                       /opt/sfw/lib/sparcv9
                       /usr/ucblib/sparcv9
                       /usr/local/openmpi-1.9_64_cc/lib64
                       /home/fd1026/SunOS/sparc/lib
    CLASSPATH
                       /usr/local/junit4.10
                       /usr/local/junit4.10/junit-4.10.jar
                       //usr/local/jdk1.7.0_07/j3d/lib/ext/j3dcore.jar
                       //usr/local/jdk1.7.0_07/j3d/lib/ext/j3dutils.jar
                       //usr/local/jdk1.7.0_07/j3d/lib/ext/vecmath.jar
                       /usr/local/javacc-5.0/javacc.jar
                       .
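
One detail visible in the dump: on linpc4, LD_LIBRARY_PATH lists /usr/local/openmpi-1.9_64_cc/lib before /usr/local/openmpi-1.9_64_cc/lib64, while sunpc4 and rs0 list only the lib64 directory, and the configure command set --libdir to lib64. Whether this ordering has anything to do with the opal_init failure is not established in this thread. The sketch below only shows one way to check which matching directory comes first in a search path. It assumes a POSIX sh, and first_matching_dir is a hypothetical helper name.

```shell
# Hypothetical helper: print the first entry of a colon-separated
# search path ($1) that contains the fixed substring $2.
first_matching_dir() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -F -- "$2" | head -n 1
}

# A shortened path with entries copied from the linpc4 dump,
# kept in the same relative order as in the dump.
linpc4_path="/usr/lib:/usr/local/openmpi-1.9_64_cc/lib:/usr/lib64:/usr/local/openmpi-1.9_64_cc/lib64"

first_matching_dir "$linpc4_path" "openmpi-1.9_64_cc"
```

On a live system the same function could be called with "$LD_LIBRARY_PATH" as its first argument; an empty result means no entry matched.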