Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] qsub error
From: Erik Nelson (nelsonerikd_at_[hidden])
Date: 2013-02-15 18:53:03


I may have deleted any responses to this message. In any case, we appear
to have fixed the problem by installing a more recent version of Open MPI.
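
For anyone who finds this thread later with a similar message, here is a minimal sketch (not part of the original attachments) that picks apart the error line quoted below. The helper name `parse_connect_error` and the regular expression are my own assumptions for illustration. It only shows that the failed connect() targeted an IPv6 address and returned errno 13; it does not establish the cause.

```python
import ipaddress
import re

# Error line copied from the quoted message in this post, joined onto one line.
ERROR_LINE = (
    "[compute-3-12.local][[58672,1],0]"
    "[btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect] "
    "connect() to 2002:8170:6c2f:b:21d:9ff:fefd:7d94 failed: "
    "Permission denied (13)"
)


def parse_connect_error(line):
    """Hypothetical helper: extract address, message and errno from the line."""
    match = re.search(r"connect\(\) to (\S+) failed: (.+?) \((\d+)\)", line)
    if match is None:
        return None
    address = ipaddress.ip_address(match.group(1))
    return {
        "address": address,
        "ip_version": address.version,  # 4 or 6
        # sixtofour is only defined on IPv6 addresses (None if not 2002::/16)
        "is_6to4": address.version == 6 and address.sixtofour is not None,
        "message": match.group(2),
        "errno": int(match.group(3)),
    }


info = parse_connect_error(ERROR_LINE)
print(info["ip_version"], info["is_6to4"], info["message"], info["errno"])
```

If the reported address is IPv6 while the interactive runs use IPv4, that difference may be worth comparing between the command-line and qsub environments.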

On Thu, Feb 14, 2013 at 2:27 PM, Erik Nelson <nelsonerikd_at_[hidden]> wrote:

>
> I'm encountering an error using qsub that none of us can figure out. MPI
> C++ programs seem to run fine when executed from the command line, but for
> some reason when I submit them through the queue I get a strange error
> message:
>
>
> [compute-3-12.local][[58672,1],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
> connect() to 2002:8170:6c2f:b:21d:9ff:fefd:7d94 failed: Permission denied (13)
>
>
> The particular compute node (3-12) doesn't matter: the error can come from
> any of the nodes, and I'm guessing that 3-12 is the parent node here.
>
> To check if there was some problem with my own code, I created a simple
> 'hello world' program (see attached files).
>
> Again, the program runs fine from the command line but fails in qsub with
> the same sort of error message.
>
> I have included (i) the code, (ii) the job script for qsub, and (iii) the
> ".o" file from qsub for the "hello world" program.
>
> These don't look like MPI errors, but rather some conflict with, maybe,
> secure communication across nodes.
>
> Is there something simple I can do to fix this?
>
> Thanks, Erik
>
> --
> Erik Nelson
>
> Howard Hughes Medical Institute
> 6001 Forest Park Blvd., Room ND10.124
> Dallas, Texas 75235-9050
>
> p : 214 645 5981
> f : 214 645 5948

-- 
Erik Nelson
Howard Hughes Medical Institute
6001 Forest Park Blvd., Room ND10.124
Dallas, Texas 75235-9050
p : 214 645 5981
f : 214 645 5948