Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] openmpi-1.2.5 and globus-4.0.5
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-03-14 19:40:45


I don't know if anyone has tried to run Open MPI with Globus before.

One requirement that Open MPI currently has is that all nodes must be
able to reach each other via TCP. Is that the case in your Globus
environment?
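
A quick way to sanity-check that (this is just a generic connectivity
test, not anything Globus-specific; nodeA and nodeB are placeholders
for your actual hosts) is to force the TCP BTL and launch a trivial
two-node job:

    mpirun --mca btl tcp,self -np 2 --host nodeA,nodeB hostname

If the nodes cannot reach each other over TCP, that command should
hang or abort instead of printing both hostnames.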

On Mar 10, 2008, at 11:01 AM, Christoph Spielmann wrote:

> Hi everybody!
>
> I am trying to get Open MPI and Globus to cooperate. These are the
> steps I executed to get Open MPI working:
>
> • export PATH=/opt/openmpi/bin/:$PATH
> • /opt/globus/setup/globus/setup-globus-job-manager-fork
>       checking for mpiexec... /opt/openmpi/bin//mpiexec
>       checking for mpirun... /opt/openmpi/bin//mpirun
>       find-fork-tools: creating ./config.status
>       config.status: creating fork.pm
> • restart VDT (includes GRAM, WSGRAM, mysql, rls...)
> As you can see, the necessary Open MPI executables are recognized
> correctly by setup-globus-job-manager-fork. But when I actually try
> to execute a simple MPI program using globus-job-run, I get this:
>
> globus-job-run localhost -x '(jobType=mpi)' -np 2 -s ./hypercube 0
> [hydra:10168] [0,0,0] ORTE_ERROR_LOG: Error in file runtime/orte_init_stage1.c at line 312
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> orte_pls_base_select failed
> --> Returned value -1 instead of ORTE_SUCCESS
>
> --------------------------------------------------------------------------
> [hydra:10168] [0,0,0] ORTE_ERROR_LOG: Error in file runtime/orte_system_init.c at line 42
> [hydra:10168] [0,0,0] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 52
> --------------------------------------------------------------------------
> Open RTE was unable to initialize properly. The error occured while
> attempting to orte_init(). Returned value -1 instead of ORTE_SUCCESS.
> --------------------------------------------------------------------------
>
> The MPI program itself is okay:
>
> which mpirun && mpirun -np 2 hypercube 0
> /opt/openmpi/bin/mpirun
> Process 0 received broadcast message 'MPI_Broadcast with hypercube topology' from Process 0
> Process 1 received broadcast message 'MPI_Broadcast with hypercube topology' from Process 0
>
>
> From what I read on the mailing list, I think that something is
> wrong with the PLS and Globus. But I have no idea what could be
> wrong, not to mention how it could be fixed ;). So if someone has an
> idea how this could be fixed, I'd be glad to hear it.
>
> Regards,
>
> Christoph
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
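
One more generic check on the orte_pls_base_select failure above (a
sketch of a diagnostic, not a Globus-specific fix; which components
show up depends entirely on how your build was configured): ompi_info
will list the process-launch (PLS) components your Open MPI install
actually contains:

    ompi_info | grep pls

If the launcher you expect is missing from that list, component
selection during orte_init can fail in exactly the way shown above.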

-- 
Jeff Squyres
Cisco Systems