Open MPI User's Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI users] EXTERNAL: Re: Problem with shell when launching jobs with OpenMPI 1.6.5 rsh
From: Blosch, Edwin L (edwin.l.blosch_at_[hidden])
Date: 2014-04-07 16:36:14

I guess this is not OpenMPI related anymore. I can repeat the essential problem interactively:

% echo $SHELL

% echo $SHLVL

% cat hello
echo Hello

% /bin/bash hello

% /bin/csh hello

% . hello
/bin/.: Permission denied

I think I will have to hope the administrator can fix it. Sorry for the bother...
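For comparison, here is a minimal sketch of how the same steps behave under a Bourne-type shell (sh or bash), where `.` is a builtin that reads and runs a file in the current shell. csh and tcsh use `source` for that purpose and have no `.` builtin, so an error that names `/bin/.` is consistent with the shell looking `.` up as an external command along the PATH. The scratch directory and the variable names are assumptions made for illustration; only the one-line `hello` script comes from the session above.

```shell
# Recreate the one-line script from the session in a scratch directory
# (the directory is hypothetical; the original file lived elsewhere).
workdir=$(mktemp -d)
printf 'echo Hello\n' > "$workdir/hello"

# Run the script by passing it to an explicit interpreter.
run_output=$(sh "$workdir/hello")

# Source the script with the "." builtin of a Bourne-type shell.
dot_output=$(. "$workdir/hello")

echo "run: $run_output"
echo "dot: $dot_output"

rm -rf "$workdir"
```

If both lines print the same text under sh or bash while `. hello` fails under the login shell, the login shell is likely a csh variant rather than a broken installation.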

-----Original Message-----
From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Reuti
Sent: Monday, April 07, 2014 3:27 PM
To: Open MPI Users
Subject: EXTERNAL: Re: [OMPI users] Problem with shell when launching jobs with OpenMPI 1.6.5 rsh

On 07.04.2014 at 22:04, Blosch, Edwin L wrote:

> I am submitting a job for execution under SGE. My default shell is /bin/csh.

Where is it the default: in SGE, or on the interactive command line you get?

> The script that is submitted has #!/bin/bash at the top. The script runs on the 1st node allocated to the job. The script runs a Python wrapper that ultimately issues the following mpirun command:
> /apps/local/test/openmpi/bin/mpirun --machinefile mpihosts.914 -np 48 -x LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 --mca btl ^tcp --mca shmem_mmap_relocate_backing_file -1 --bind-to-core --bycore --mca orte_rsh_agent /usr/bin/rsh --mca plm_rsh_disable_qrsh 1 /apps/local/test/solver/bin/solver_openmpi -cycles 50 -ri restart.0 -i flow.inp >& output
> Just so there's no confusion, OpenMPI is built without support for SGE. It should be using rsh to launch.
> There are 4 nodes involved (each 12 cores, 48 processes total). In the output file, I see 3 sets of messages as shown below. I assume I am seeing 1 set of messages for each of the 3 remote nodes where processes need to be launched:
> /bin/.: Permission denied.
> OPAL_PREFIX=/apps/local/falcon2014/openmpi: Command not found.
> export: Command not found.
> PATH=/apps/local/test/openmpi/bin:/bin:/usr/bin:/usr/ccs/bin:/usr/local/bin:/usr/openwin/bin:/usr/local/etc:/home/bloscel/bin:/usr/ucb:/usr/bsd: Command not found.
> export: Command not found.
> LD_LIBRARY_PATH: Undefined variable.

This really looks like csh is trying to interpret bash commands. If SGE's queue is configured with "shell_start_mode posix_compliant", the first line of the script is not treated in any special way. In that case you can change the shell only with "-S /bin/bash" (or redefine the queue to use "shell_start_mode unix_behavior" to get the expected behavior when starting a script. Side effect: the shell is no longer started as a login shell. See also `man sge_conf` => "login_shells" for details).
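As a sketch of the "-S /bin/bash" route: the option can be given on the command line (`qsub -S /bin/bash job.sh`) or embedded in the job script as a `#$` directive, as below. The script name `job.sh` and the helper `interpreter_name` are hypothetical and only serve the illustration; the check on `BASH_VERSION` is a simple way to confirm which shell actually ended up interpreting the script.

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
# The "#$" lines above are SGE directives embedded in the script:
# -S requests bash as the job shell, -cwd runs the job in the submit directory.

# Hypothetical helper: report whether bash is interpreting this script.
interpreter_name() {
    if [ -n "${BASH_VERSION:-}" ]; then
        echo "bash"
    else
        echo "not bash"
    fi
}

echo "job script interpreted by: $(interpreter_name)"
```

Outside SGE the directive lines are ordinary comments, so the script can be tried interactively before it is submitted.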

BTW: are you avoiding a tight integration on purpose?

-- Reuti

> These look like errors you get when csh is trying to parse commands intended for bash.
> Does anyone know what may be going on here?
> Thanks,
> Ed
> _______________________________________________
> users mailing list
> users_at_[hidden]
