Hi-
We run OpenMPI 1.4.3 on RHEL5 in a cluster environment.
We use Univa Grid Engine 8.0.1 (an SGE spinoff) for job submission.
We've just recently begun supporting the bash shell for submitted jobs,
and are seeing a problem with submitted MPI jobs.
Our software environment is managed with the Modules package (version 3.2.8),
so a typical job submission looks something like this:
#!/bin/bash
#$ <some GE directives>
module load ompi
mpiexec
When mpiexec runs, we see the following errors:
bash: module: line 1: syntax error: unexpected end of file
bash: error importing function definition for `module'
The module init file contains this function, which I assume is what all the fuss is about:
module() { eval `/opt/crc/Modules/$MODULE_VERSION/bin/modulecmd bash $*`; }
export -f module
The error is generated multiple times. For example, if I'm
running a 48-core mpi-12 job spread across 4 machines,
I'll see these errors printed 3 times. I don't see these errors
on jobs submitted to a single machine.
I've found posts for this error on bash, modules, and SGE lists, and have
tried a number of suggested workarounds that all involve changing how I
source modules (in /etc/profile.d, .bash_profile, via BASH_ENV), but
none have gotten rid of this error.
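
One approach that does not change how modules are sourced is to drop the
exported function inside the job script once the modules are loaded, so that
nothing tries to import it on the remote side. The sketch below is untested on
our cluster and rests on an assumption: that the errors come from the exported
function definition being damaged on its way to the remote nodes. It uses a
hypothetical stand-in function, demo_module, in place of the real module
function, and only shows that a child bash stops inheriting the function after
"unset -f":

```bash
#!/bin/bash
# demo_module is a hypothetical stand-in for the real module() function.
demo_module() { echo "stand-in called with: $*"; }
export -f demo_module

# A child bash inherits the exported function through the environment.
before=$(bash -c 'type -t demo_module')
echo "child sees before unset: ${before:-nothing}"

# Remove the function in this shell; it is no longer passed to children.
unset -f demo_module

after=$(bash -c 'type -t demo_module')
echo "child sees after unset: ${after:-nothing}"
```

In a real job script the equivalent would be "unset -f module" placed between
"module load ompi" and "mpiexec". After that point the job script itself can
no longer call module.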
Since we only see this problem with MPI, I figured it couldn't hurt to post
here and see if any of you have had this symptom, and what your solution was.
I should mention that running a submitted MPI job under csh works just fine.
Thanks for any help,
Mark
Mark Suhovecky
HPC System Administrator
Center for Research Computing
University of Notre Dame
suhovecky at nd.edu