One thing stands out right away: why are you specifying a hostfile? Did you remember to configure OMPI with --with-tm so we launch via Torque? If not, then you could hit issues as you are actually attempting to launch via ssh, which has implications on a Torque-based system.
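
A minimal sketch of that check, written as a fragment for the job script. It makes several assumptions: that `ompi_info` is on the PATH inside the job, that a `tm` entry in its output means Torque support was built in, and that `$PBS_NODEFILE` holds one line per allocated slot. The `count_slots` helper is a hypothetical name used only for illustration, and the `mpirun` line is left commented out because it depends on the local build.

```shell
#!/bin/sh
# Sketch only: check for Torque (tm) support and size the job from the
# slots Torque actually granted, rather than from ${node} and ${ppn},
# which the sample script quoted below never defines.

# Hypothetical helper: count lines in a Torque node file (one per slot).
count_slots() {
    wc -l < "$1" | tr -d ' '
}

# If ompi_info is available, list any tm components (for example ras and plm).
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info | grep tm || echo "no tm components found: mpirun would fall back to ssh/rsh"
fi

# PBS_NODEFILE is set by Torque inside a running job.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    nprocs=$(count_slots "$PBS_NODEFILE")
    echo "Torque allocated ${nprocs} slot(s)"
    # With tm support built in, the hostfile should not be needed:
    # mpirun -np "${nprocs}" /home/khan/a.out < /home/khan/iter
fi
```

If no `tm` components show up, rebuilding Open MPI with --with-tm is the first thing to try before looking further at the PSM error.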

On Mar 29, 2012, at 8:51 AM, Raju wrote:

Hi Team,

I am using QLogic InfiniBand and Open MPI 1.5.3. I can run jobs from the command line without any issues, but when I submit them through the Torque scheduler I hit the issue below. The error file is attached.

Overview of the problem: initialization failure on /dev/ipath (err=23)
PSM was unable to open an endpoint. Please make sure that the network link is
active on the node and the hardware is functioning.


  Error: Failure in initializing endpoint

I went through the link for a solution and followed the same steps, but had no luck.

I set export PSM_SHAREDCONTEXTS_MAX=16 in my submit script and submitted the job.

The sample input file is:

#PBS -N matmul
#PBS -l nodes=1:ppn=1
nprocs=`expr ${node} \* ${ppn}`


mpirun -np ${nprocs} --hostfile $PBS_NODEFILE  /home/khan/a.out < /home/khan/iter


Please let me know whether I am doing this correctly, and suggest the best way forward.


Bhagya Raju K

devel mailing list