Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Segfaults w/ both 1.4 and 1.5 on CentOS 6.2/SGE
From: Reuti (reuti_at_[hidden])
Date: 2012-03-15 10:46:01


On 15.03.2012 at 15:37, Ralph Castain wrote:

> Just to be clear: I take it that the first entry is the host name, and the second is the number of slots allocated on that host?

This is correct.

> FWIW: I see the problem. Our parser was apparently written assuming every line was a unique host, so it doesn't even check to see if there is duplication. Easy fix - can shoot it to you today.

But even with the fix, the nice value will be the same for all processes forked on that host: either all of them get the nice value of his low-priority queue, or all of them get that of the high-priority queue.
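
For illustration, the merging step discussed in this thread (summing the slots of all hostfile lines that name the same host) might look like the sketch below. It is not Open MPI's actual parser, which is written in C. The function name merge_pe_hostfile is hypothetical, and the sketch assumes only what is confirmed further down: the first field is the host name and the second is the slot count. The remaining columns are ignored.

```python
def merge_pe_hostfile(lines):
    """Sum the slot counts of all lines that name the same host.

    Assumes each line starts with "<hostname> <slots>"; any further
    columns (queue instance, processor range) are ignored.
    Hosts are returned in first-seen order.
    """
    slots = {}
    for raw in lines:
        fields = raw.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        host, count = fields[0], int(fields[1])
        slots[host] = slots.get(host, 0) + count
    return slots


# Sample based on the hostfile quoted in this thread
# (queue column written in SGE's queue@host form).
SAMPLE = """\
iq103 8 lab.q@iq103 <NULL>
iq103 1 test.q@iq103 <NULL>
iq104 8 lab.q@iq104 <NULL>
iq104 1 test.q@iq104 <NULL>
opt221 2 lab.q@opt221 <NULL>
opt221 1 test.q@opt221 <NULL>
"""

if __name__ == "__main__":
    for host, count in merge_pe_hostfile(SAMPLE.splitlines()).items():
        print(host, count)
```

With the sample above, each host ends up with a single entry (9 slots for iq103 and iq104, 3 for opt221). As noted, this only fixes the slot accounting; it does not give processes from different queues different nice values.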

-- Reuti

> On Mar 15, 2012, at 6:53 AM, Reuti wrote:
>
>> On 15.03.2012 at 05:22, Joshua Baker-LePain wrote:
>>
>>> On Wed, 14 Mar 2012 at 5:50pm, Ralph Castain wrote
>>>
>>>> On Mar 14, 2012, at 5:44 PM, Reuti wrote:
>>>
>>>>> (I was just typing when Ralph's message came in: I can confirm this. Avoiding it would require Open MPI to collect all lines in the hostfile that refer to the same machine. SGE creates an entry for each queue/host pair in the machine file.)
>>>>
>>>> Hmmm…I can take a look at the allocator module and see why we aren't doing it. Would the host names be the same for the two queues?
>>>
>>> I can't speak authoritatively like Reuti can, but here's what a hostfile
>>> looks like on my cluster (note that all our name resolution is done via /etc/hosts -- there's no DNS involved):
>>>
>>> iq103 8 lab.q@iq103 <NULL>
>>> iq103 1 test.q@iq103 <NULL>
>>> iq104 8 lab.q@iq104 <NULL>
>>> iq104 1 test.q@iq104 <NULL>
>>> opt221 2 lab.q@opt221 <NULL>
>>> opt221 1 test.q@opt221 <NULL>
>>
>> Yes, exactly this needs to be parsed, adding up the slots of all entries for one and the same machine.
>>
>> If you need it right away, it could be put in a wrapper for start_proc_args of the PE (with Open MPI compiled without SGE support), so that a custom-built machinefile can be used. In this case the rsh or ssh call also needs to be caught.
>>
>> Often the opposite is desired in an SGE setup: tune it so that all slots are coming from one queue only.
>>
>> But I still wonder whether it is possible to tune your setup in a similar way: allow one more slot in the high-priority queue (long.q) in case it's a parallel job, with an RQS (assuming 8 cores with one core of oversubscription):
>>
>> limit queues long.q pes * to slots=9
>> limit queues long.q to slots=8
>>
>> while you have an additional short.q (the low-priority queue) there with one slot. The overall limit is still set to 9 at the exechost level. The PE is then attached only to long.q.
>>
>> -- Reuti
>>
>> PS: In your example you also had a case with 2 slots in the low-priority queue. What is the actual setup in your cluster?
>>
>>
>>>>> @Ralph: it could work if SGE had a facility to request the desired queue in `qrsh -inherit ...`, because then the $TMPDIR would be unique for each orted again (assuming it's using different ports for each).
>>>>
>>>> Gotcha! I suspect getting the allocator to handle this cleanly is the better solution, though.
>>>
>>> If I can help (testing patches, e.g.), let me know.
>>>
>>> --
>>> Joshua Baker-LePain
>>> QB3 Shared Cluster Sysadmin
>>> UCSF
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users