
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts
From: Reuti (reuti_at_[hidden])
Date: 2010-11-18 06:42:51


On 18.11.2010, at 11:57, Terry Dontje wrote:

> Yes, I believe this solves the mystery. In short, OGE and ORTE both work. In the linear:1 case the job is exiting because there are not enough resources for the ORTE binding to work, which actually makes sense. In the linear:2 case I think we've proven that we are binding to the right amount of resources, and to the correct physical resources, at the process level.
>
> In the case where you do not pass bind-to-core to mpirun with a qsub using linear:2, the processes on the same node will actually be bound to the same two cores. The only way to determine this is to run something that prints out the binding from the system. There is no way to do this via OMPI, because it only reports bindings when you are requesting mpirun to do some type of binding (like -bind-to-core or -bind-to-socket).
>
> In the linear:1 case with no binding I think you are having the processes on the same node run on the same core, which is exactly what you are asking for, I believe.
>
> So I believe we understand what is going on with the binding, and it makes sense to me. As far as the allocation issue of slots vs. cores and trying not to overallocate cores, I believe the new allocation rule makes sense, but I'll let you hash that out with Daniel.
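
For reference, the external binding Terry mentions can be printed from the system side, independent of Open MPI (just a sketch, assuming Linux; <pid> stands for the process id of the rank you want to inspect):

grep Cpus_allowed_list /proc/<pid>/status    # the kernel's view of the CPU affinity
taskset -cp <pid>                            # the same information via util-linux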

I still vote for a flag "limit_to_one_qrsh_per_host true/false" in the PE definition, which a) checks whether any attempt is made to start a second `qrsh -inherit ...` to one and the same node (similar to "job_is_first_task", which allows or denies a local `qrsh -inherit ...`), and b) as a side effect allocates *all* cores to this one and only shepherd started on the node.
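
To illustrate where this would sit, a sketch of a PE definition as printed by `qconf -sp` (the last line is only the proposed flag, not an attribute SGE knows today; the other values are merely an example):

$ qconf -sp mpi
pe_name                     mpi
slots                       999
allocation_rule             $round_robin
control_slaves              TRUE
job_is_first_task           FALSE
limit_to_one_qrsh_per_host  TRUE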

And a second one, "limit_cores_by_slot_count true/false", instead of new allocation_rules. Choosing $fill_up, $round_robin or others is independent of the limiting, IMO.
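
Extending the same sketch, this second flag would be orthogonal to the allocation rule (again, limit_cores_by_slot_count is only the proposal, not an existing attribute):

allocation_rule             $fill_up
limit_cores_by_slot_count   TRUE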

-- Reuti

> In summary, I don't believe there are any OMPI bugs related to what we've seen, and the OGE issue is just the allocation issue, right?
>
> --td
>
>
> On 11/18/2010 01:32 AM, Chris Jewell wrote:
>>>> Perhaps if someone could run this test again with --report-bindings --leave-session-attached and provide -all- output we could verify that analysis and clear up the confusion?
>>>>
>>>>
>>> Yeah, however I bet you we still won't see output.
>>>
>> Actually, it seems we do get more output! Results of 'qsub -pe mpi 8 -binding linear:2 myScript.com'
>>
>> with
>>
>> 'mpirun -mca ras_gridengine_verbose 100 -report-bindings --leave-session-attached -bycore -bind-to-core ./unterm'
>>
>> [exec1:06504] System has detected external process binding to cores 0028
>> [exec1:06504] ras:gridengine: JOB_ID: 59467
>> [exec1:06504] ras:gridengine: PE_HOSTFILE: /usr/sge/default/spool/exec1/active_jobs/59467.1/pe_hostfile
>> [exec1:06504] ras:gridengine: exec1.cluster.stats.local: PE_HOSTFILE shows slots=2
>> [exec1:06504] ras:gridengine: exec3.cluster.stats.local: PE_HOSTFILE shows slots=1
>> [exec1:06504] ras:gridengine: exec2.cluster.stats.local: PE_HOSTFILE shows slots=1
>> [exec1:06504] ras:gridengine: exec7.cluster.stats.local: PE_HOSTFILE shows slots=1
>> [exec1:06504] ras:gridengine: exec4.cluster.stats.local: PE_HOSTFILE shows slots=1
>> [exec1:06504] ras:gridengine: exec5.cluster.stats.local: PE_HOSTFILE shows slots=1
>> [exec1:06504] ras:gridengine: exec6.cluster.stats.local: PE_HOSTFILE shows slots=1
>> [exec1:06504] [[59608,0],0] odls:default:fork binding child [[59608,1],0] to cpus 0008
>> [exec1:06504] [[59608,0],0] odls:default:fork binding child [[59608,1],1] to cpus 0020
>> [exec3:20248] [[59608,0],1] odls:default:fork binding child [[59608,1],2] to cpus 0008
>> [exec4:26792] [[59608,0],4] odls:default:fork binding child [[59608,1],5] to cpus 0001
>> [exec2:32462] [[59608,0],2] odls:default:fork binding child [[59608,1],3] to cpus 0001
>> [exec7:09833] [[59608,0],3] odls:default:fork binding child [[59608,1],4] to cpus 0002
>> [exec5:10834] [[59608,0],5] odls:default:fork binding child [[59608,1],6] to cpus 0001
>> [exec6:04230] [[59608,0],6] odls:default:fork binding child [[59608,1],7] to cpus 0001
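
(As an aside: those masks appear to be hexadecimal core bitmasks, so they can be decoded with e.g. bc; the values below are the ones from the run above:

echo "obase=2; $((0x0028))" | bc    # -> 101000: cores 3 and 5
echo "obase=2; $((0x0008))" | bc    # -> 1000:   core 3
echo "obase=2; $((0x0020))" | bc    # -> 100000: core 5

i.e. the two ranks on exec1 were placed on exactly the two cores of the external linear:2 binding.)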
>>
>> AHHA! Now I get the following if I use 'qsub -pe mpi 8 -binding linear:1 myScript.com' with the above mpirun command:
>>
>> [exec1:06552] System has detected external process binding to cores 0020
>> [exec1:06552] ras:gridengine: JOB_ID: 59468
>> [exec1:06552] ras:gridengine: PE_HOSTFILE: /usr/sge/default/spool/exec1/active_jobs/59468.1/pe_hostfile
>> [exec1:06552] ras:gridengine: exec1.cluster.stats.local: PE_HOSTFILE shows slots=2
>> [exec1:06552] ras:gridengine: exec3.cluster.stats.local: PE_HOSTFILE shows slots=1
>> [exec1:06552] ras:gridengine: exec2.cluster.stats.local: PE_HOSTFILE shows slots=1
>> [exec1:06552] ras:gridengine: exec7.cluster.stats.local: PE_HOSTFILE shows slots=1
>> [exec1:06552] ras:gridengine: exec4.cluster.stats.local: PE_HOSTFILE shows slots=1
>> [exec1:06552] ras:gridengine: exec5.cluster.stats.local: PE_HOSTFILE shows slots=1
>> [exec1:06552] ras:gridengine: exec6.cluster.stats.local: PE_HOSTFILE shows slots=1
>> --------------------------------------------------------------------------
>> mpirun was unable to start the specified application as it encountered an error:
>>
>> Error name: Unknown error: 1
>> Node: exec1
>>
>> when attempting to start process rank 0.
>> --------------------------------------------------------------------------
>> [exec1:06552] [[59432,0],0] odls:default:fork binding child [[59432,1],0] to cpus 0020
>> --------------------------------------------------------------------------
>> Not enough processors were found on the local host to meet the requested
>> binding action:
>>
>> Local host: exec1
>> Action requested: bind-to-core
>> Application name: ./unterm
>>
>> Please revise the request and try again.
>> --------------------------------------------------------------------------
>> [exec4:26816] [[59432,0],4] odls:default:fork binding child [[59432,1],5] to cpus 0001
>> [exec3:20345] [[59432,0],1] odls:default:fork binding child [[59432,1],2] to cpus 0020
>> [exec2:32486] [[59432,0],2] odls:default:fork binding child [[59432,1],3] to cpus 0001
>> [exec7:09921] [[59432,0],3] odls:default:fork binding child [[59432,1],4] to cpus 0002
>> [exec6:04257] [[59432,0],6] odls:default:fork binding child [[59432,1],7] to cpus 0001
>> [exec5:10861] [[59432,0],5] odls:default:fork binding child [[59432,1],6] to cpus 0001
>>
>>
>>
>> Hope that helps clear up the confusion! Please say it does, my head hurts...
>>
>> Chris
>>
>>
>> --
>> Dr Chris Jewell
>> Department of Statistics
>> University of Warwick
>> Coventry
>> CV4 7AL
>> UK
>> Tel: +44 (0)24 7615 0778
>
>
> --
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.dontje_at_[hidden]