Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] sge tight intregration leads to bad allocation
From: Reuti (reuti_at_[hidden])
Date: 2012-04-05 14:58:47


On 05.04.2012 at 18:58, Eloi Gaudry wrote:

> -----Original Message-----
> From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On Behalf Of Reuti
> Sent: Thursday, 5 April 2012 18:41
> To: Open MPI Users
> Subject: Re: [OMPI users] sge tight intregration leads to bad allocation
>
> On 05.04.2012 at 17:55, Eloi Gaudry wrote:
>
>>
>>>> Here is the allocation info retrieved from `qstat -g t` for the related job:
>>>
>>> For me the output of `qstat -g t` shows MASTER and SLAVE entries but no variables. Is there any wrapper defined for `qstat` to reformat the output (or a ~/.sge_qstat defined)?
>>>
>>> [eg: ] Sorry, I forgot that ~/.sge_qstat was defined. As I don't have any slots available right now, I cannot relaunch the job to get updated output.
>> Reuti, here is the output you asked for two days ago.
>> It was produced with another "bad" run, in which 3 processes are running on each of the nodes charlie and carl, whereas we should have only 2 processes on carl and 4 on charlie.
>
> This is indeed strange, as it first detects the correct allocation. And it conforms to the one granted.
>
> - You used a plain `mpiexec` without any number of processes or machinefile?
> [eg: ] I'm using orterun and I'm only providing the number of processes. Shouldn't I?

With a tight integration it should work without specifying the number of ranks. But OTOH it shouldn't hurt if it's given.
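
To compare the granted allocation with what is actually running, a minimal sketch along the following lines may help. It assumes the usual `$PE_HOSTFILE` layout (one line per host: hostname, slot count, queue, processor range); the helper names, the sample queue names, and the shortening of hostnames to their first label are assumptions made for illustration, not part of SGE or Open MPI.

```python
import os


def parse_pe_hostfile(text):
    """Return {short_hostname: granted_slots} from PE_HOSTFILE-style text.

    Assumed layout per line: hostname slots queue processor-range.
    """
    slots = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        host = fields[0].split(".")[0]  # assumption: compare short hostnames
        slots[host] = slots.get(host, 0) + int(fields[1])
    return slots


def allocation_mismatch(granted, observed):
    """Return {host: (granted, observed)} for hosts whose counts differ."""
    hosts = sorted(set(granted) | set(observed))
    return {
        h: (granted.get(h, 0), observed.get(h, 0))
        for h in hosts
        if granted.get(h, 0) != observed.get(h, 0)
    }


if __name__ == "__main__":
    # Inside a running SGE job, PE_HOSTFILE points at the granted allocation.
    path = os.environ.get("PE_HOSTFILE")
    if path:
        with open(path) as fh:
            print(parse_pe_hostfile(fh.read()))
```

The observed counts would have to be collected by hand, for instance from the `ps` output on each node, and passed in as a dictionary such as `{"charlie": 3, "carl": 3}`.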

-- Reuti

> - Can you please post while it's running the relevant lines from:
> ps -e f --cols=500
> (f w/o -) from both machines.
> It's allocated between the nodes more like in a round-robin fashion.
> [eg: ] I'll try to do this tomorrow, as soon as some slots become free. Thanks for your feedback, Reuti, I appreciate it.