I may have deleted any responses to this message. Either way, we appear to have fixed the problem
by installing a more recent version of Open MPI.


On Thu, Feb 14, 2013 at 2:27 PM, Erik Nelson <nelsonerikd@gmail.com> wrote:

I'm encountering an error using qsub that none of us can figure out. MPI C++ programs seem to
run fine when executed from the command line, but when I submit them through the queue I get
a strange error message:


[compute-3-12.local][[58672,1],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
connect() to 2002:8170:6c2f:b:21d:9ff:fefd:7d94 failed: Permission denied (13)


The particular compute node (3-12) doesn't matter: the error can come from any of the nodes, and
I'm guessing that 3-12 is the parent node here.

To check if there was some problem with my own code, I created a simple 'hello world' program
(see attached files).

Again, the program runs fine from the command line but fails in qsub with the same sort of error
message.

I have included (i) the code (ii) the job script for qsub, and (iii) the ".o" file from qsub for the
"hello world" program.

These don't look like MPI errors, but rather some conflict with, maybe, secure communication
across nodes.
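
One detail in the message above that may help narrow this down: the address that connect() is
trying to reach is an IPv6 literal, and it starts with 2002:, the prefix reserved for 6to4
tunnelling. The sketch below is not one of the attached files; it is a small standalone check,
with a hypothetical helper name (is_6to4_address), that only confirms what kind of address
appears in the log. It says nothing about why the connection was refused.

```cpp
#include <arpa/inet.h>   // inet_pton, AF_INET6 (POSIX)
#include <netinet/in.h>  // in6_addr
#include <string>

// Hypothetical helper, for illustration only.
// Returns true if `text` parses as an IPv6 address inside 2002::/16,
// the prefix reserved for 6to4 tunnelling.
bool is_6to4_address(const std::string& text) {
    in6_addr addr{};
    // inet_pton returns 1 only when the string is a valid IPv6 literal.
    if (inet_pton(AF_INET6, text.c_str(), &addr) != 1) {
        return false;
    }
    // The first 16 bits of a 6to4 address are 0x2002.
    return addr.s6_addr[0] == 0x20 && addr.s6_addr[1] == 0x02;
}
```

Calling is_6to4_address("2002:8170:6c2f:b:21d:9ff:fefd:7d94") with the address from the log
returns true, which would be consistent with the failure being in the network layer rather
than in the MPI program itself.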

Is there something simple I can do to fix this?

Thanks, Erik

--
Erik Nelson

Howard Hughes Medical Institute
6001 Forest Park Blvd., Room ND10.124
Dallas, Texas 75235-9050

p : 214 645 5981
f : 214 645 5948


