Hello,


First I would like to thank you for all your answers. :)


I run all my tests on the MOM nodes, which I request through the queuing system; otherwise I cannot access the compute nodes. The installation also needs to see the appropriate libraries and header files, which are not available on the login nodes here. ;)


In my first test I used mpirun, as it was built with ALPS support and should therefore be able to handle the startup on the compute nodes.

I followed your suggestion and also tried aprun, which gave me the same error.
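
To narrow down the "unable to create IPv4 listen socket" message, a quick standalone check of whether a process on a given node can open a TCP listen socket at all might look like the sketch below. This is not Open MPI code; the function name can_open_listen_socket is made up for illustration, and it assumes Python is available on the node where it is run.

```python
import socket


def can_open_listen_socket(host="0.0.0.0"):
    """Try to create, bind, and listen on an IPv4 TCP socket.

    Returns (True, port) on success or (False, error_text) on failure.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.bind((host, 0))  # port 0: let the kernel pick a free port
            sock.listen(1)
            return True, sock.getsockname()[1]
    except OSError as exc:
        # Covers failures in socket(), bind(), and listen() alike.
        return False, str(exc)


if __name__ == "__main__":
    ok, detail = can_open_listen_socket()
    if ok:
        print("listen socket ok, port", detail)
    else:
        print("listen socket failed:", detail)
```

If this already fails on a compute node, the problem would be in the node's network setup rather than in the MPI startup itself; if it succeeds, it only shows that plain socket creation works there.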


An installation using the PMI 2.1.4 interface reports no errors but hangs silently during the startup process.


Best regards

Christoph


On Wednesday 10 October 2012 20:55:15 Ralph Castain wrote:

> Sorry - I saw the "pirun" cmd and thought it was some kind of Cray cmd

>

>

> On Oct 10, 2012, at 9:11 AM, Nathan Hjelm <hjelmn@lanl.gov> wrote:

> > He is using mpirun from what I can see. And in this case the orted will

> > use PMI but the app will use the tcp oob to talk to the orted since

> > there is no shmem oob atm.

> >

> > -Nathan

> >

> > On Wed, Oct 10, 2012 at 08:04:20AM -0700, Ralph Castain wrote:

> >> Hi Nathan

> >>

> >> The only way to get that OOB error is if PMI isn't running - hence my

> >> earlier note. If PMI isn't actually running, then we fall back to the

> >> TCP OOB and try to open sockets - which won't work because the app is

> >> being direct-launched.

> >>

> >> Alternatively, he could launch using "mpirun" and then it should work

> >> just fine.

> >>

> >> On Wed, Oct 10, 2012 at 7:59 AM, Nathan Hjelm <hjelmn@lanl.gov> wrote:

> >>> On Wed, Oct 10, 2012 at 02:50:59PM +0200, Christoph Niethammer wrote:

> >>>> Hello,

> >>>>

> >>>> I just tried to use Open MPI 1.7a1r27416 on a Cray XE6 system.

> >>>> Unfortunately I get the following error when I run a simple

> >>>> HelloWorldMPI program:

> >>>>

> >>>> $ pirun HelloWorldMPI

> >>>> App launch reported: 2 (out of 2) daemons - 0 (out of 32) procs

> >>>> ...

> >>>> [unset]:_pmi_alps_get_appLayout:pmi_alps_get_apid returned with error:

> >>>> Bad file descriptor

> >>>

> >>> There is a bug in Cray's PMI-3 which causes this error message. Change

> >>> the platform file to point at PMI 2.1.4. I was hoping Cray would fix

> >>> the bug before 1.7.0. Since that doesn't appear to be the case I will

> >>> push updated platform files that use PMI 2.1.4 instead of the default.

> >>>

> >>>> [nid01766:20603] mca_oob_tcp_init: unable to create IPv4 listen socket:

> >>>> Unable to open a TCP socket for out-of-band communications

> >>>> ...

> >>>

> >>> Never seen this error before. What PE release is installed?

> >>>

> >>> -Nathan

> >>> _______________________________________________

> >>> users mailing list

> >>> users@open-mpi.org

> >>> http://www.open-mpi.org/mailman/listinfo.cgi/users
