Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] srun and openmpi
From: Michael Di Domenico (mdidomenico4_at_[hidden])
Date: 2011-04-29 10:01:25


On Fri, Apr 29, 2011 at 4:52 AM, Ralph Castain <rhc_at_[hidden]> wrote:
> Hi Michael
>
> Please see the attached updated patch to try for 1.5.3. I mistakenly free'd the envar after adding it to the environ :-/

The patch works great. I can now see the precondition environment
variable if I do

mpirun -n 2 -host node1 <prog>

and my <prog> runs just fine. However, if I do

srun --resv-ports -n 2 -w node1 <prog>

I get

[node1:16780] PSM EP connect error (unknown connect error):
[node1:16780]  node1
[node1:16780] PSM EP connect error (Endpoint could not be reached):
[node1:16780]  node1

PML add procs failed
--> Returned "Error" (-1) instead of "Success" (0)

I did notice a difference in the precondition env variable between the two runs:

mpirun -n 2 -host node1 <prog>

sets precondition_transports=fbc383997ee1b668-00d40f1401d2e827 (which
changes with each run, i.e. it looks random)

srun --resv-ports -n 2 -w node1 <prog>

sets precondition_transports=0000184500000000-0000000100000000 (which
doesn't seem to change run to run)
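
To make the difference between the two values easier to see, here is a
rough Python sketch that splits each key into four 32-bit words. The
helper names (find_transport_keys, split_key) are made up for
illustration. Since the full environment variable name is not given
above, the lookup simply matches any variable whose name contains
"precondition_transports"; that matching rule is an assumption.

```python
import os


def find_transport_keys(environ=None):
    """Return {name: value} for variables whose name mentions the key.

    Assumption: the full variable name is unknown here, so we match on
    the substring "precondition_transports" rather than an exact name.
    """
    env = os.environ if environ is None else environ
    return {k: v for k, v in env.items() if "precondition_transports" in k}


def split_key(value):
    """Split a '<16 hex digits>-<16 hex digits>' key into four 32-bit words."""
    halves = value.split("-")
    if len(halves) != 2 or any(len(h) != 16 for h in halves):
        raise ValueError("unexpected key format: %r" % value)
    words = []
    for half in halves:
        n = int(half, 16)               # one 64-bit half of the key
        words.append(n >> 32)           # upper 32 bits
        words.append(n & 0xFFFFFFFF)    # lower 32 bits
    return words


if __name__ == "__main__":
    # The two values reported in this message.
    mpirun_key = "fbc383997ee1b668-00d40f1401d2e827"
    srun_key = "0000184500000000-0000000100000000"
    print("mpirun:", [hex(w) for w in split_key(mpirun_key)])
    print("srun:  ", split_key(srun_key))  # [6213, 0, 1, 0]
    # Inside a launched process, show whatever key was actually set.
    for name, value in find_transport_keys().items():
        print(name, "=", value)
```

The srun value decodes to the small integers 6213, 0, 1, 0, while every
word of the mpirun value is populated. That might mean the srun key is
built from fixed identifiers rather than random data, but that is only a
guess from these two samples.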