Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Openmpi-1.5.3 issue " initialization failure on /dev/ipath (err=23)"
From: Raju (brajuk_at_[hidden])
Date: 2012-03-29 11:26:02


Hi Ralph,

I recompiled OMPI with the --with-tm option, but I still see the same issue.
I changed the submit script as shown below. Please let me know what I need to
tune and verify.

#!/bin/bash
#PBS -N matmul
#PBS -l nodes=1:ppn=1
node=1
ppn=1
nprocs=`expr ${node} \* ${ppn}`
export PSM_SHAREDCONTEXTS_MAX=16

mpirun -np ${nprocs} /home/khan/a.out < /home/khan/iter
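
One thing that could be verified at this point is whether the rebuilt Open MPI actually contains the Torque (tm) components. The following is a minimal sketch, assuming the `ompi_info` from the rebuilt installation is the first one in PATH and that it lists components on lines of the form `MCA plm: tm (...)` and `MCA ras: tm (...)`. The helper name `has_tm_support` is hypothetical and used here only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `ompi_info` output on stdin and succeeds only
# if both the tm launcher (plm) and tm allocator (ras) components are listed.
has_tm_support() {
    # Buffer stdin so it can be scanned twice.
    info=$(cat)
    printf '%s\n' "$info" | grep -Eq 'MCA[[:space:]]+plm:[[:space:]]+tm([[:space:]]|$)' &&
        printf '%s\n' "$info" | grep -Eq 'MCA[[:space:]]+ras:[[:space:]]+tm([[:space:]]|$)'
}

# Only query ompi_info if it is available in PATH (assumption: this is the
# rebuilt installation, not an older one that appears earlier in PATH).
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_tm_support; then
        echo "tm support: yes"
    else
        echo "tm support: no"
    fi
fi
```

If this reports no tm support, the mpirun picked up inside the job may belong to a different installation than the one that was rebuilt; running `which mpirun` from inside the job script would help confirm which one is used.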

Regards,
Raju...

On Thu, Mar 29, 2012 at 8:49 PM, Raju <brajuk_at_[hidden]> wrote:

> Hi Ralph,
>
> Thanks for the very quick response. I am recompiling with the --with-tm
> option now; once it is done I will report back.
>
> Thanks
> Raju..
>
>
> On Thu, Mar 29, 2012 at 8:29 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>
>> One thing stands out right away: why are you specifying a hostfile? Did
>> you remember to configure OMPI with --with-tm so we launch via Torque? If
>> not, then you could hit issues as you are actually attempting to launch via
>> ssh, which has implications on a Torque-based system.
>>
>>
>> On Mar 29, 2012, at 8:51 AM, Raju wrote:
>>
>> Hi Team,
>>
>> I am using QLogic InfiniBand and Open MPI 1.5.3. I am able to run jobs
>> from the command line without any issues, but when I submit them through
>> the Torque scheduler I hit the issue below. The error file is attached.
>>
>> *Overview of the problem:*
>> node1.ibab.ac.in.5910Driver initialization failure on /dev/ipath (err=23)
>> --------------------------------------------------------------------------
>> PSM was unable to open an endpoint. Please make sure that the network
>> link is
>> active on the node and the hardware is functioning.
>>
>>
>> Error: Failure in initializing endpoint
>>
>>
>> I went through the thread at
>> http://www.open-mpi.org/community/lists/users/2011/12/17888.php for a
>> solution and followed it, but had no luck.
>>
>> In my submit script I set export PSM_SHAREDCONTEXTS_MAX=16 and then
>> submitted the job.
>>
>> The sample submit script is:
>> #!/bin/bash
>> #PBS -N matmul
>> #PBS -l nodes=1:ppn=1
>> node=1
>> ppn=1
>> nprocs=`expr ${node} \* ${ppn}`
>> echo "--- PBS_NODEFILE CONTENT ---"
>> cat $PBS_NODEFILE
>> export PSM_SHAREDCONTEXTS_MAX=16
>>
>>
>> mpirun -np ${nprocs} --hostfile $PBS_NODEFILE /home/khan/a.out < /home/khan/iter
>>
>>
>>
>> Please let me know whether I am doing this correctly, and suggest the
>> best approach.
>>
>> Regards,
>>
>> Bhagya Raju K
>> <errfile.txt>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel