Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] srun and openmpi
From: Michael Di Domenico (mdidomenico4_at_[hidden])
Date: 2010-12-30 11:57:06


I think I take it all back. I just tried it again and it seems to
work now. I'm not sure what I changed (between my first message and
this one), but it does appear to work now.
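
For anyone who hits the same error, one way to test the guess quoted
below (that the port reservation is not reaching the MPI processes) is
to have every task report what it sees. This is only a minimal sketch:
it assumes SLURM_RESV_PORTS holds a comma-separated list of ports or
low-high ranges, and the function names and the script name
check_ports.py are hypothetical, not part of Slurm or Open MPI.

```python
import os


def parse_resv_ports(spec):
    """Expand a port spec such as "12000-12003,12010" into a list of ints.

    The comma/range format is an assumption about SLURM_RESV_PORTS.
    """
    ports = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            ports.extend(range(int(low), int(high) + 1))
        else:
            ports.append(int(part))
    return ports


def describe_reservation(environ=os.environ):
    """Return a one-line summary of the reservation visible to this task."""
    rank = environ.get("SLURM_PROCID", "?")
    ports = parse_resv_ports(environ.get("SLURM_RESV_PORTS", ""))
    if not ports:
        return f"task {rank}: no reserved ports visible"
    return f"task {rank}: {len(ports)} reserved ports, first {ports[0]}, last {ports[-1]}"


if __name__ == "__main__":
    print(describe_reservation())
```

Launched with something like "srun --resv-ports -n 4 python check_ports.py",
every task should print a non-empty reservation. A task reporting no
reserved ports would point at the reservation not being passed down.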

On Thu, Dec 30, 2010 at 4:31 PM, Michael Di Domenico
<mdidomenico4_at_[hidden]> wrote:
> Yes, that's true, error messages help.  I was hoping there was some
> documentation to see what I've done wrong.  I can't easily cut and
> paste errors from my cluster.
>
> Here's a snippet (hand typed) of the error message, but it does look
> like a rank communication error:
>
> ORTE_ERROR_LOG: A message is attempting to be sent to a process whose
> contact information is unknown in file rml_oob_send.c at line 145.
> *** MPI_INIT failure message (snipped) ***
> orte_grpcomm_modex failed
> --> Returned "A message is attempting to be sent to a process whose
> contact information is unknown" (-117) instead of "Success" (0)
>
> This message repeats for each rank and ultimately hangs the srun,
> which I have to Ctrl-C and terminate.
>
> I have MpiPorts defined in my slurm config, and running srun with
> --resv-ports does show the SLURM_RESV_PORTS environment variable
> getting passed to the shell.
>
>
> On Thu, Dec 23, 2010 at 8:09 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>> I'm not sure there is any documentation yet - not much clamor for it. :-/
>>
>> It would really help if you included the error message. Otherwise, all I can do is guess, which wastes both of our time :-(
>>
>> My best guess is that the port reservation didn't get passed down to the MPI procs properly - but that's just a guess.
>>
>>
>> On Dec 23, 2010, at 12:46 PM, Michael Di Domenico wrote:
>>
>>> Can anyone point me towards the most recent documentation for using
>>> srun and openmpi?
>>>
>>> I followed what I found on the web, enabling the MpiPorts config
>>> in slurm and using the --resv-ports switch, but I'm getting an error
>>> from openmpi during setup.
>>>
>>> I'm using Slurm 2.1.15 and Openmpi 1.5 w/PSM
>>>
>>> I'm sure I'm missing a step.
>>>
>>> Thanks
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users