Rolf Vandevaart wrote:
>> PMGR_COLLECTIVE ERROR: unitialized MPI task: Missing required
>> environment variable: MPIRUN_RANK
>> PMGR_COLLECTIVE ERROR: PMGR_COLLECTIVE ERROR: unitialized MPI task:
>> Missing required environment variable: MPIRUN_RANK
> I do not recognize these errors as part of Open MPI. A Google search
> showed they might be coming from MVAPICH. Is there a chance we are
> using Open MPI to launch the jobs (via Open MPI mpirun) but we are
> actually launching an application that is linked to MVAPICH?
You are correct. I was trying to run the MVAPICH-compiled test program.
With an Open MPI-compiled test, I do get an extra line of output with the
verbose flag, but the program just hangs at that point.
[muno@compute-6-30 ~]$ which mpirun
[muno@compute-6-30 ~]$ ldd a.out
libmpi.so.0 => /share/apps/opt/openmpi_pgi/lib/libmpi.so.0
mpirun -np $NSLOTS -mca pls_gridengine_verbose 1 a.out
Starting server daemon at host "compute-6-25.local"
Starting server daemon at host "compute-1-1.local"
Server daemon successfully started with task id "1.compute-6-25"
error: commlib error: access denied (client IP resolved to host name "".
This is not identical to clients host name "")
error: executing task of job 12144 failed: failed sending task to
execd@compute-1-1.local: can't find connection
[compute-6-25.local:10810] ERROR: A daemon on node compute-1-1.local
failed to start as expected.
[compute-6-25.local:10810] ERROR: There may be more information
[compute-6-25.local:10810] ERROR: the 'qstat -t' command on the Grid
[compute-6-25.local:10810] ERROR: If the problem persists, please
[compute-6-25.local:10810] ERROR: Grid Engine PE job
[compute-6-25.local:10810] ERROR: The daemon exited unexpectedly with
Establishing /usr/bin/ssh session to host compute-6-25.local ...
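
For anyone else who hits the launcher/library mix-up discussed at the top of this message, the "which mpirun" / "ldd a.out" check can be scripted roughly as below. This is only a sketch, not anything Open MPI ships. The helper names (mpi_lib_path, same_prefix, check_binary) are made up for illustration, and it assumes a conventional <prefix>/bin/mpirun and <prefix>/lib/libmpi.so.N layout. A statically linked binary, or one linked against a differently named MPI library, simply shows up as "no shared libmpi found".

```shell
# mpi_lib_path: read `ldd` output on stdin and print the path that
# libmpi.so.* resolves to (prints nothing if no such line exists).
mpi_lib_path() {
    awk '$1 ~ /^libmpi\.so/ && $2 == "=>" { print $3; exit }'
}

# same_prefix MPIRUN_PATH LIB_PATH
# Succeeds when both paths sit under the same install prefix.
# Assumes the <prefix>/bin/mpirun and <prefix>/lib/libmpi.so.N layout.
same_prefix() {
    launcher_prefix=$(dirname "$(dirname "$1")")   # strip /bin/mpirun
    library_prefix=$(dirname "$(dirname "$2")")    # strip /lib/libmpi.so.N
    [ "$launcher_prefix" = "$library_prefix" ]
}

# check_binary BINARY: report whether BINARY's libmpi and the mpirun
# found on PATH appear to come from the same installation.
check_binary() {
    lib=$(ldd "$1" | mpi_lib_path)
    if [ -z "$lib" ]; then
        echo "$1: no shared libmpi found (static link or a different MPI?)"
    elif same_prefix "$(command -v mpirun)" "$lib"; then
        echo "$1: libmpi and mpirun share an install prefix"
    else
        echo "$1: libmpi ($lib) and mpirun come from different prefixes"
    fi
}
```

Running "check_binary a.out" on the compute node before submitting the job would flag the case where mpirun belongs to one MPI and the test program was built with another. It does not say anything about the commlib "access denied" error above, which is a separate problem.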