I'm having a problem running OpenMPI under Torque. mpirun complains as if there were a command syntax problem, but as best I can tell from mpirun -help, the three variations below are all correct. The environment in which the command executes, i.e. PATH and LD_LIBRARY_PATH, is correct. Torque is 2.3.x. OpenMPI is 1.2.8. OFED is 1.4.
Somewhere in the FAQ I read that under Torque with OpenMPI 1.2.8 you must not give -machinefile and do not need to give -np. That is why variation 3 below omits both options, but it still fails.
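For reference, the FAQ guidance mentioned above can be sketched as a job script. This is only an illustration, assuming an OpenMPI build with Torque (tm) support; the job name, the nodes/ppn/walltime values, and the relative path to the executable are placeholders, not taken from the actual job.

```shell
#!/bin/sh
#PBS -N falcon_test              # hypothetical job name
#PBS -l nodes=7:ppn=4            # placeholder request: 7 nodes x 4 slots = 28 processes
#PBS -l walltime=01:00:00        # placeholder wall-clock limit

# Torque sets PBS_O_WORKDIR to the directory qsub was run from.
cd "$PBS_O_WORKDIR"

# Check whether this OpenMPI build includes Torque (tm) components.
# If nothing is listed, mpirun cannot read the allocation from Torque.
/usr/mpi/intel/openmpi-1.2.8/bin/ompi_info | grep tm

# With tm support, mpirun takes the node list and process count from
# the Torque allocation, so neither -machinefile nor -np is given.
/usr/mpi/intel/openmpi-1.2.8/bin/mpirun ./falconv4_ibm_openmpi \
    -cycles 100 -ri restart.0 -ro restart.0
```

Such a script would be submitted with qsub; it does not by itself explain the errors shown below.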
Thanks for any help.
/usr/mpi/intel/openmpi-1.2.8/bin/mpirun -np 28 /tmp/43.fwnaeglingio/falconv4_ibm_openmpi -cycles 100 -ri restart.0 -ro /tmp/43.fwnaeglingio/restart.0
--------------------------------------------------------------------------
Failed to find the following executable:
Host: n8n26
Executable: -p
Cannot continue.
mpirun --prefix /usr/mpi/intel/openmpi-1.2.8 --machinefile /var/spool/torque/aux/45.fwnaeglingio -np 28 --mca btl ^tcp --mca mpi_leave_pinned 1 --mca mpool_base_use_mem_hooks 1 -x LD_LIBRARY_PATH -x MPI_ENVIRONMENT /tmp/45.fwnaeglingio/falconv4_ibm_openmpi -cycles 100 -ri restart.0 -ro /tmp/45.fwnaeglingio/restart.0
--------------------------------------------------------------------------
Failed to find or execute the following executable:
Host: n8n27
Executable: --prefix /usr/mpi/intel/openmpi-1.2.8
Cannot continue.
/usr/mpi/intel/openmpi-1.2.8/bin/mpirun -x LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 /tmp/47.fwnaeglingio/falconv4_ibm_openmpi -cycles 100 -ri restart.0 -ro /tmp/47.fwnaeglingio/restart.0
--------------------------------------------------------------------------
Failed to find the following executable:
Host: n8n27
Executable: -
Cannot continue.