
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] Openmpi-1.5.3 issue " initialization failure on /dev/ipath (err=23)"
From: Jeffrey Squyres (jsquyres_at_[hidden])
Date: 2012-03-29 11:40:26


I didn't realize from your text that the SHAREDCONTEXTS_MAX value made it work.

If so, I would assume that is a good solution. But I don't know for sure; you might well need to contact QLogic and ask.
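
The submit script discussed in this thread can be sketched roughly as follows. This is an illustration under stated assumptions, not a confirmed fix. The helper names `has_tm_support` and `rank_count` are hypothetical, the `grep` pattern assumes `ompi_info` prints component lines of the form `MCA ras: tm (...)`, and the guard assumes Torque sets `PBS_JOBID` inside a job. The sketch drops `--hostfile` (relying on a build configured with `--with-tm`) and uses mpirun's `-x` option to pass `PSM_SHAREDCONTEXTS_MAX` to the launched processes. Whether 16 is the right value for a given QLogic setup is a question for QLogic.

```shell
#!/bin/bash
#PBS -N matmul
#PBS -l nodes=1:ppn=1

node=1
ppn=1

# Hypothetical helper: reads `ompi_info` output on stdin and succeeds if a
# Torque ("tm") launcher (plm) or allocator (ras) component is listed.
# Assumes lines shaped like "MCA ras: tm (MCA v2.0, ...)".
has_tm_support() {
    grep -Eq 'MCA (plm|ras): tm([[:space:]]|$)'
}

# Hypothetical helper: total rank count = nodes * processes per node.
rank_count() {
    echo $(( $1 * $2 ))
}

# Only launch when running inside a Torque job (assumes PBS_JOBID is set there).
if [ -n "${PBS_JOBID:-}" ]; then
    export PSM_SHAREDCONTEXTS_MAX=16

    # Warn if this Open MPI build does not appear to have Torque support.
    if ! ompi_info 2>/dev/null | has_tm_support; then
        echo "warning: no tm component found; was Open MPI configured --with-tm?" >&2
    fi

    # No --hostfile: with tm support, mpirun takes the node list from Torque.
    # -x exports the named environment variable to the launched processes.
    mpirun -np "$(rank_count "$node" "$ppn")" -x PSM_SHAREDCONTEXTS_MAX \
        /home/khan/a.out < /home/khan/iter
fi
```

Outside a Torque job the guard skips the launch, so the helpers can be sourced and tried by hand, for example `ompi_info | has_tm_support && echo "tm support present"`.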

On Mar 29, 2012, at 11:34 AM, Raju wrote:

> Hi Jeffrey,
>
> Thanks, I will contact them. As I mentioned earlier, the Open MPI developers suggested setting PSM_SHAREDCONTEXTS_MAX="some value" as the solution.
>
> I put export PSM_SHAREDCONTEXTS_MAX=16 in my submit script. Please correct me: is that the right way to set it, or should I do it some other way?
>
> Regards
> Raju...
>
> On Thu, Mar 29, 2012 at 8:58 PM, Jeffrey Squyres <jsquyres_at_[hidden]> wrote:
> This looks like a PSM problem (PSM is the layer that runs below Open MPI on QLogic NICs). You might need to contact QLogic tech support to find out how to solve it.
>
>
> On Mar 29, 2012, at 11:26 AM, Raju wrote:
>
> > Hi Ralph,
> >
> > I recompiled OMPI with the --with-tm option, but I still see the same issue. I changed the submit script as shown below. Please let me know what I need to tune and verify.
> >
> > #!/bin/bash
> > #PBS -N matmul
> > #PBS -l nodes=1:ppn=1
> > node=1
> > ppn=1
> > nprocs=`expr ${node} \* ${ppn}`
> > export PSM_SHAREDCONTEXTS_MAX=16
> >
> > mpirun -np ${nprocs} /home/khan/a.out < /home/khan/iter
> >
> > Regards,
> > Raju...
> >
> > On Thu, Mar 29, 2012 at 8:49 PM, Raju <brajuk_at_[hidden]> wrote:
> > Hi Ralph,
> >
> > Thanks for the very quick response. I am recompiling with the --with-tm option now; once it is done I will report back.
> >
> > Thanks
> > Raju..
> >
> >
> > On Thu, Mar 29, 2012 at 8:29 PM, Ralph Castain <rhc_at_[hidden]> wrote:
> > One thing stands out right away: why are you specifying a hostfile? Did you remember to configure OMPI with --with-tm so we launch via Torque? If not, then you could hit issues as you are actually attempting to launch via ssh, which has implications on a Torque-based system.
> >
> >
> > On Mar 29, 2012, at 8:51 AM, Raju wrote:
> >
> >> Hi Team,
> >>
> >> I am using QLogic InfiniBand and Open MPI 1.5.3. I can run jobs from the command line without any issues, but when I submit them through the Torque scheduler I hit the issue below.
> >>
> >> The error file is attached.
> >>
> >> Overview of the problem:
> >>
> >> node1.ibab.ac.in.5910Driver initialization failure on /dev/ipath (err=23)
> >> --------------------------------------------------------------------------
> >> PSM was unable to open an endpoint. Please make sure that the network link is
> >> active on the node and the hardware is functioning.
> >>
> >> Error: Failure in initializing endpoint
> >>
> >> I went through http://www.open-mpi.org/community/lists/users/2011/12/17888.php for a solution and followed it, but had no luck.
> >>
> >> I set the value in my submit script with export PSM_SHAREDCONTEXTS_MAX=16 and submitted the job.
> >>
> >> The sample submit script is:
> >>
> >> #!/bin/bash
> >> #PBS -N matmul
> >> #PBS -l nodes=1:ppn=1
> >> node=1
> >> ppn=1
> >> nprocs=`expr ${node} \* ${ppn}`
> >> echo "--- PBS_NODEFILE CONTENT ---"
> >> cat $PBS_NODEFILE
> >> export PSM_SHAREDCONTEXTS_MAX=16
> >>
> >> mpirun -np ${nprocs} --hostfile $PBS_NODEFILE /home/khan/a.out < /home/khan/iter
> >>
> >> Please let me know whether I am doing this correctly, and suggest how I can get the best results.
> >>
> >> Regards,
> >> Bhagya Raju K
> >> <errfile.txt>
> >> _______________________________________________
> >> devel mailing list
> >> devel_at_[hidden]
> >> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >
> >
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
>
>

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/