Open MPI User's Mailing List Archives

From: Adam C Powell IV (hazelsct_at_[hidden])
Date: 2007-07-18 11:49:35


As mentioned, I'm running in a chroot environment, so rsh and ssh won't
work: "rsh localhost" will rsh into the primary local host environment,
not the chroot, which will fail.

[The purpose is to be able to build and test MPI programs in the Debian
unstable distribution, without upgrading the whole machine to unstable.
Though most machines I use for this purpose run Debian stable or
testing, the machine I'm currently using runs a very old Fedora, for
which I don't think OpenMPI is available.]

With MPICH, mpirun -np 1 just runs the new process in the current
context, without rsh/ssh, so it works in a chroot. Does OpenMPI not
support this functionality?
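
For reference, the check Tim asks about below ("which ssh" / "which rsh") can be run inside the chroot as a small script. This is a minimal sketch only: the function name check_launch_agents is made up for illustration, and it assumes the rsh launcher needs ssh or rsh visible on PATH, which this thread does not confirm.

```shell
#!/bin/sh
# Hypothetical helper: report whether each candidate launch agent
# is visible on PATH in the current environment (e.g. the chroot).
# Prints one line per agent; returns 0 if at least one was found,
# 1 otherwise.
check_launch_agents() {
    found=1
    for agent in "$@"; do
        if path=$(command -v "$agent" 2>/dev/null); then
            echo "$agent: $path"
            found=0
        else
            echo "$agent: not found"
        fi
    done
    return "$found"
}

# Check the two agents mentioned in this thread.
check_launch_agents ssh rsh || echo "no launch agent on PATH in this environment"
```

Running it once inside the chroot and once outside shows whether the two environments see different agents.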

Thanks,
Adam

On Wed, 2007-07-18 at 11:09 -0400, Tim Prins wrote:
> This is strange. I assume that you want to use rsh or ssh to launch the
> processes?
>
> If you want to use ssh, does "which ssh" find ssh? Similarly, if you
> want to use rsh, does "which rsh" find rsh?
>
> Thanks,
>
> Tim
>
> Adam C Powell IV wrote:
> > On Wed, 2007-07-18 at 09:50 -0400, Tim Prins wrote:
> >> Adam C Powell IV wrote:
> >>> Greetings,
> >>>
> >>> I'm running the Debian package of OpenMPI in a chroot (with /proc
> >>> mounted properly), and orte_init is failing as follows:
> >>> [snip]
> >>> What could be wrong? Does orterun not run in a chroot environment?
> >>> What more can I do to investigate further?
> >> Try running mpirun with the added options:
> >> -mca orte_debug 1 -mca pls_base_verbose 20
> >>
> >> Then send the output to the list.
> >
> > Thanks! Here's the output:
> >
> > $ orterun -mca orte_debug 1 -mca pls_base_verbose 20 -np 1 uptime
> > [new-host-3:19201] mca: base: components_open: Looking for pls components
> > [new-host-3:19201] mca: base: components_open: distilling pls components
> > [new-host-3:19201] mca: base: components_open: accepting all pls components
> > [new-host-3:19201] mca: base: components_open: opening pls components
> > [new-host-3:19201] mca: base: components_open: found loaded component gridengine
> > [new-host-3:19201] mca: base: components_open: component gridengine open function successful
> > [new-host-3:19201] mca: base: components_open: found loaded component proxy
> > [new-host-3:19201] mca: base: components_open: component proxy open function successful
> > [new-host-3:19201] mca: base: components_open: found loaded component rsh
> > [new-host-3:19201] mca: base: components_open: component rsh open function successful
> > [new-host-3:19201] mca: base: components_open: found loaded component slurm
> > [new-host-3:19201] mca: base: components_open: component slurm open function successful
> > [new-host-3:19201] orte:base:select: querying component gridengine
> > [new-host-3:19201] pls:gridengine: NOT available for selection
> > [new-host-3:19201] orte:base:select: querying component proxy
> > [new-host-3:19201] orte:base:select: querying component rsh
> > [new-host-3:19201] orte:base:select: querying component slurm
> > [new-host-3:19201] [0,0,0] ORTE_ERROR_LOG: Error in file runtime/orte_init_stage1.c at line 312
> > --------------------------------------------------------------------------
> > It looks like orte_init failed for some reason; your parallel process is
> > likely to abort. There are many reasons that a parallel process can
> > fail during orte_init; some of which are due to configuration or
> > environment problems. This failure appears to be an internal failure;
> > here's some additional information (which may only be relevant to an
> > Open MPI developer):
> >
> > orte_pls_base_select failed
> > --> Returned value -1 instead of ORTE_SUCCESS
> >
> > --------------------------------------------------------------------------
> > [new-host-3:19201] [0,0,0] ORTE_ERROR_LOG: Error in file runtime/orte_system_init.c at line 42
> > [new-host-3:19201] [0,0,0] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 52
> > --------------------------------------------------------------------------
> > Open RTE was unable to initialize properly. The error occured while
> > attempting to orte_init(). Returned value -1 instead of ORTE_SUCCESS.
> > --------------------------------------------------------------------------
> >
> > -Adam

-- 
GPG fingerprint: D54D 1AEE B11C CE9B A02B  C5DD 526F 01E8 564E E4B6
Welcome to the best software in the world today cafe!
http://www.take6.com/albums/greatesthits.html