Open MPI User's Mailing List Archives

Subject: [OMPI users] openmpi-1.2.5 and globus-4.0.5
From: Christoph Spielmann (cspielma_at_[hidden])
Date: 2008-03-10 11:01:32


Hi everybody!

I am trying to get Open MPI and Globus to cooperate. These are the steps I
executed to get Open MPI working (a quick sanity check follows the list):

   1. export PATH=/opt/openmpi/bin/:$PATH
   2. /opt/globus/setup/globus/setup-globus-job-manager-fork
      checking for mpiexec... /opt/openmpi/bin//mpiexec
      checking for mpirun... /opt/openmpi/bin//mpirun
      find-fork-tools: creating ./config.status
      config.status: creating fork.pm
   3. restart VDT (includes GRAM, WSGRAM, mysql, rls...)
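
To double-check that the setup really wired in the Open MPI binaries, the
generated fork.pm could be grepped. This is only a sketch and assumes the
usual Globus layout under $GLOBUS_LOCATION; the exact path may differ on a
VDT install:

grep -E 'mpirun|mpiexec' $GLOBUS_LOCATION/lib/perl/Globus/GRAM/JobManager/fork.pm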

As the output of step 2 shows, the necessary Open MPI executables are
recognized correctly by setup-globus-job-manager-fork. But when I actually
try to execute a simple MPI program using globus-job-run, I get this:

globus-job-run localhost -x '(jobType=mpi)' -np 2 -s ./hypercube 0
[hydra:10168] [0,0,0] ORTE_ERROR_LOG: Error in file runtime/orte_init_stage1.c at line 312
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_pls_base_select failed
  --> Returned value -1 instead of ORTE_SUCCESS

--------------------------------------------------------------------------
[hydra:10168] [0,0,0] ORTE_ERROR_LOG: Error in file runtime/orte_system_init.c at line 42
[hydra:10168] [0,0,0] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 52
--------------------------------------------------------------------------
Open RTE was unable to initialize properly. The error occured while
attempting to orte_init(). Returned value -1 instead of ORTE_SUCCESS.
--------------------------------------------------------------------------
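
(In case it is useful for debugging, here is a sketch of how more detail
about the pls selection could be gathered. It assumes that ompi_info lists
the pls components this build contains, and that the usual
OMPI_MCA_<framework>_base_verbose parameter and OMPI_MCA_* environment
variables are honoured when the job is started through globus-job-run; I
have not verified this.)

ompi_info | grep pls
export OMPI_MCA_pls_base_verbose=10
globus-job-run localhost -x '(jobType=mpi)' -np 2 -s ./hypercube 0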

The MPI program itself is fine when started directly with mpirun:

which mpirun && mpirun -np 2 hypercube 0
/opt/openmpi/bin/mpirun
Process 0 received broadcast message 'MPI_Broadcast with hypercube topology' from Process 0
Process 1 received broadcast message 'MPI_Broadcast with hypercube topology' from Process 0

From what I have read on this mailing list, I think something is wrong
between the pls framework and Globus. But I have no idea what exactly is
wrong, let alone how it could be fixed ;). So if someone has an idea how
to fix this, I'd be glad to hear it.

Regards,

Christoph