On Apr 6, 2010, at 4:59 PM, David Turner wrote:
> Hi Ralph,
>> Are you using a scheduler of some kind? If so, you can add this to your default mca param file:
> Yes, we are running torque/moab.
>> orte_allocation_required = 1
>> This will prevent anyone running without having an allocation. You can also set
> Ah. An "allocation". Not much info on this on the open-mpi website.
> I believe this is what we will want, to prevent mpirun on login nodes.
Yes - it was added specifically to solve similar problems we had on Moab-scheduled clusters. The motivator was someone who forgot to get an allocation and ran a 256-process job; since we allow oversubscription, that hammered the login node into the ground.
With the above mca param set, mpirun will tell you "you need an allocation" and cleanly abort.
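For reference, a minimal sketch of what that looks like in the system-wide param file. I'm assuming the usual location, $prefix/etc/openmpi-mca-params.conf, where $prefix is wherever your Open MPI install lives; adjust if your setup differs.

```
# $prefix/etc/openmpi-mca-params.conf
# (path is an assumption; it depends on your install prefix)

# Refuse to launch unless mpirun is running inside a scheduler allocation
orte_allocation_required = 1
```

Since your install sits on a global file system, that one file should be picked up on both the login and the compute nodes. Inside a torque job the allocation exists, so batch runs should be unaffected.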
>> rmaps_base_no_schedule_local = 1
>> which tells mpirun not to schedule any MPI procs on the local node.
> In our batch environment, mpirun will be executing on one of the
> compute nodes. That is, we don't have dedicated MOM nodes.
> Therefore, I think we will want to schedule (at least) one MPI
> task on the same node. Actually, when somebody wants to run
> (for example) 256 tasks packed on 32 8-core nodes, I think we'll
> need mpirun to share a *core* with one of the MPI tasks. The above
> option would prevent that, correct?
Yeah, you don't want to set this one for a torque environment.
>> Does that solve the problem?
> I'll give it a try and let you know. Thanks!
>> On Apr 6, 2010, at 3:28 PM, David Turner wrote:
>>> Our cluster has a handful of login nodes, and then a bunch of
>>> compute nodes. OpenMPI is installed in a global file system
>>> visible from both sets of nodes. This means users can type
>>> "mpirun" from an interactive prompt, and quickly oversubscribe
>>> the login node.
>>> So, is there a way to explicitly exclude hosts from consideration
>>> for mpirun? To prevent (what is usually accidental) running
>>> MPI apps on our login nodes? Thanks!
>>> Best regards,
>>> David Turner
>>> User Services Group email: dpturner_at_[hidden]
>>> NERSC Division phone: (510) 486-4027
>>> Lawrence Berkeley Lab fax: (510) 486-4316
>>> users mailing list