Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] ompi + bash + GE + modules
From: Gustavo Correa (gus_at_[hidden])
Date: 2012-01-11 13:24:37


Hi Mark

I wonder if you need to initialize the module command environment inside your SGE
bash submission script by sourcing the init file:

. $MODULESHOME/init/<shell>

where <shell> is bash in this case. See 'man module' for more details.

This would be before you actually invoke the module command:

module load openmpi
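
Putting the two steps together, a minimal sketch of what the top of the job script
might look like. The helper name init_modules is hypothetical (it is not part of
the Modules package), and I am assuming MODULESHOME is already set in the job's
environment and that your module is called openmpi:

```shell
#!/bin/bash
# Hypothetical helper (not part of Modules): source the Modules init file
# for the given shell so that the 'module' command is defined in this shell.
init_modules() {
    shell_name="${1:-bash}"
    init_file="${MODULESHOME:-}/init/${shell_name}"
    if [ -n "${MODULESHOME:-}" ] && [ -r "$init_file" ]; then
        . "$init_file"
        return 0
    fi
    echo "Modules init file not found: $init_file" >&2
    return 1
}

# Only try to load the module if the init file was actually sourced.
if init_modules bash; then
    module load openmpi
    # mpiexec with your usual arguments would go here
fi
```

If the init file is missing, the sketch prints a message to stderr and skips the
module load rather than failing with "module: command not found".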

I am guessing your users' default shell is csh, and they may perhaps have a 'csh'
module environment initialized by their .cshrc, but the job submission script is in bash.
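
One way to test that guess is to print, from inside a submitted job, which login
shell is in use and whether a module command is visible at all. This is only a
rough sketch; the function name module_status is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: report whether a 'module' command (function or
# executable) is visible in the current shell.
module_status() {
    if command -v module >/dev/null 2>&1; then
        echo "defined"
    else
        echo "missing"
    fi
}

echo "SHELL=${SHELL:-unknown}, module command: $(module_status)"
```

If this reports "missing" in a bash job but the same users can run module
interactively under csh, the init file is probably only being sourced from .cshrc.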

Anyway, we use Torque, not SGE, so this is just a guess.

I hope it helps,
Gus Correa

On Jan 11, 2012, at 12:42 PM, Mark Suhovecky wrote:

>
> Hi-
>
> We run OpenMPI 1.4.3 on RHEL5 in a cluster environment.
> We use Univa Grid Engine 8.0.1 (an SGE spinoff) for job submission.
> We've just recently begun supporting the bash shell for submitted jobs,
> and are seeing a problem with submitted MPI jobs.
>
> Our software environment is managed with the Modules package (version 3.2.8),
> so a typical job submission looks something like this:
>
> #!/bin/bash
> #$ <some GE directives>
>
> module load ompi
>
> mpiexec
>
> When mpiexec is run, we see the following errors:
>
>
> bash: module: line 1: syntax error: unexpected end of file
> bash: error importing function definition for `module'
>
> The module init file contains this function, which is what I'm assuming all the fuss is about:
>
> module() { eval `/opt/crc/Modules/$MODULE_VERSION/bin/modulecmd bash $*`; }
> export -f module
>
> Multiple instances of the error are generated. For example, if I'm
> running a 48 core mpi-12 job spread across 4 machines,
> I'll see these errors printed 3 times. I don't see these errors
> on single-machine submitted jobs.
>
> I've found posts for this error on bash, modules, and SGE lists, and have
> tried a number of suggested workarounds that all involve changing how I
> source modules (in /etc/profile.d, .bash_profile, via BASH_ENV), but
> none have gotten rid of this error.
>
> Since we only see this problem with MPI, I figured it couldn't hurt to post
> here and see if any of you have had this symptom, and what your solution was.
>
> I should mention that running a submitted MPI job under csh works just fine.
>
> Thanks for any help,
>
> Mark
>
> Mark Suhovecky
> HPC System Administrator
> Center for Research Computing
> University of Notre Dame
> suhovecky at nd.edu