Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] srun and openmpi
From: Ralph Castain (rhc_at_[hidden])
Date: 2010-12-30 12:13:52


Hooray!

On Dec 30, 2010, at 9:57 AM, Michael Di Domenico wrote:

> I think I take it all back. I just tried it again and it seems to
> work now. I'm not sure what I changed (between my first and this
> msg), but it does appear to work now.
>
> On Thu, Dec 30, 2010 at 4:31 PM, Michael Di Domenico
> <mdidomenico4_at_[hidden]> wrote:
>> Yes that's true, error messages help. I was hoping there was some
>> documentation to see what I've done wrong. I can't easily cut and
>> paste errors from my cluster.
>>
>> Here's a snippet (hand typed) of the error message, but it does look
>> like a rank communications error:
>>
>> ORTE_ERROR_LOG: A message is attempting to be sent to a process whose
>> contact information is unknown in file rml_oob_send.c at line 145.
>> *** MPI_INIT failure message (snipped) ***
>> orte_grpcomm_modex failed
>> --> Returned "A message is attempting to be sent to a process whose
>> contact information is unknown" (-117) instead of "Success" (0)
>>
>> This msg repeats for each rank and ultimately hangs the srun, which I
>> have to Ctrl-C to terminate.
>>
>> I have MpiPorts defined in my slurm config, and running srun with
>> --resv-ports does show the SLURM_RESV_PORTS environment variable
>> getting passed to the shell.
>>
>>
>> On Thu, Dec 23, 2010 at 8:09 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>> I'm not sure there is any documentation yet - not much clamor for it. :-/
>>>
>>> It would really help if you included the error message. Otherwise, all I can do is guess, which wastes both of our time :-(
>>>
>>> My best guess is that the port reservation didn't get passed down to the MPI procs properly - but that's just a guess.
>>>
>>>
>>> On Dec 23, 2010, at 12:46 PM, Michael Di Domenico wrote:
>>>
>>>> Can anyone point me towards the most recent documentation for using
>>>> srun and openmpi?
>>>>
>>>> I followed what I found on the web with enabling the MpiPorts config
>>>> in slurm and using the --resv-ports switch, but I'm getting an error
>>>> from openmpi during setup.
>>>>
>>>> I'm using Slurm 2.1.15 and Openmpi 1.5 w/PSM
>>>>
>>>> I'm sure I'm missing a step.
>>>>
>>>> Thanks
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
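
Ralph's guess in this thread was that the port reservation did not reach the MPI procs. One way to check that from inside each task is a small script like the sketch below. It is not part of Open MPI or Slurm. It assumes SLURM_RESV_PORTS holds a comma-separated list of ports and/or low-high ranges (for example "12000-12003"), which may not match every Slurm version, and the helper names parse_port_ranges and reserved_ports are made up for illustration.

```python
import os


def parse_port_ranges(spec):
    """Expand a string such as "12000-12003,12010" into a sorted list of ints.

    The accepted format is an assumption about SLURM_RESV_PORTS, not a
    documented guarantee.
    """
    ports = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low_text, high_text = part.split("-", 1)
            low, high = int(low_text), int(high_text)
            if low > high:
                raise ValueError("bad port range: " + part)
            ports.update(range(low, high + 1))
        else:
            ports.add(int(part))
    return sorted(ports)


def reserved_ports(environ=None):
    """Return the ports found in SLURM_RESV_PORTS, or [] if it is unset."""
    environ = os.environ if environ is None else environ
    spec = environ.get("SLURM_RESV_PORTS")
    if not spec:
        return []
    return parse_port_ranges(spec)


if __name__ == "__main__":
    # SLURM_PROCID is the task rank that srun exports to each task.
    task = os.environ.get("SLURM_PROCID", "?")
    ports = reserved_ports()
    if ports:
        print("task %s: %d reserved port(s), %d..%d"
              % (task, len(ports), ports[0], ports[-1]))
    else:
        print("task %s: SLURM_RESV_PORTS is not set" % task)
```

Saved under a hypothetical name such as check_ports.py, it could be launched with something like "srun --resv-ports -n 4 python check_ports.py". If any task reports that the variable is not set, the reservation is not reaching that task's environment, which would be consistent with the modex failure quoted above.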