Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] control openmpi or force to use pbs?
From: Gus Correa (gus_at_[hidden])
Date: 2013-02-05 13:03:19


On 02/05/2013 08:52 AM, Jeff Squyres (jsquyres) wrote:
> To add to what Reuti said, if you enable PBS support in Open MPI, when users "mpirun ..." in a PBS job, Open MPI will automatically use the PBS native launching mechanism, which won't let you run outside of the servers allocated to that job.
>
> Concrete example: if you qsub a job and are allocated node A, B, and C, but then try to run with "mpirun --host D,E,F ...", you'll get an error.
>
> That being said -- keep in mind what Reuti said: if users are allowed to ssh between nodes that are not allocated to them, then they can always bypass this behavior and use just Open MPI's ssh support to launch on nodes D, E, F (etc.).
>
>
>
> On Feb 5, 2013, at 2:46 AM, Reuti<reuti_at_[hidden]> wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> On 05.02.2013 at 11:24, Duke Nguyen wrote:
>>
>>> Please advise me how to force our users to use pbs instead of "mpirun
>>> --hostfile"? Or how do I control mpirun so that any user using "mpirun
>>> --hostfile" will not overload the cluster? We have OpenMPI installed
>>> with Torque/Maui and we can control users' limits (total number of
>>> procs, total nodes, total memory, etc.) with Torque/Maui, but if a user
>>> knows the cluster, and creates himself a hostfile with all the available
>>> nodes then he can use them all.
>>
>> Can the users use a plain ssh between the nodes? If they are forced to use the TM of Torque instead, it should be impossible to start a job on a non-granted machine.
>>
>> - -- Reuti
>>
>>
>>> Thanks,
>>>
>>> D.
>>>
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG/MacGPG2 v2.0.18 (Darwin)
>> Comment: GPGTools - http://gpgtools.org
>>
>> iEYEARECAAYFAlEQ4w8ACgkQo/GbGkBRnRo/dQCgw/5R9Z0XiVvlp7R0LjNkIjWC
>> ixkAoJKYXi7fv4xiAVHLkT2rDApI1cXi
>> =xo+z
>> -----END PGP SIGNATURE-----
>>
>
>

Hi Duke

Here is something to add to Reuti's and Jeff's suggestions.

If you build your own Torque/PBS with PAM support
(./configure --with-pam [other configure flags]),
you can prevent users who are not running a Torque/PBS job
on a node from launching processes on that node.

See this:
http://docs.adaptivecomputing.com/torque/4-1-3/help.htm#topics/1-installConfig/customizingTheInstall.htm

Of course you will need to rebuild your OpenMPI with Torque
support again, after you install a version of Torque with PAM
support.

This is mostly a Torque/Maui issue, with a bit of an MPI issue.
You may get more help about this on the Torque and Maui
mailing lists, and in their archives you may find more specific
guidance on what you need to add to the pam/security
files to make it work.

Torque with PAM support is not 100% foolproof,
because users that *are* running a Torque/PBS job on
a node can still cheat and launch more processes there,
but it helps restrict the problem to this case.
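
To make the rule concrete, here is a small Python sketch of the
decision that a PAM-enabled Torque setup is meant to enforce. It is
only an illustration of the logic, not the PAM module itself. The Job
record and the may_launch_on function are names I made up for the
example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical record of a running Torque/PBS job."""
    job_id: str
    user: str
    nodes: frozenset  # names of the nodes allocated to this job


def may_launch_on(user, node, running_jobs):
    """Return True if `user` has at least one running job on `node`."""
    return any(job.user == user and node in job.nodes for job in running_jobs)


jobs = [Job("101", "alice", frozenset({"n01", "n02"}))]
print(may_launch_on("alice", "n01", jobs))  # alice has a job on n01
print(may_launch_on("alice", "n03", jobs))  # alice has no job on n03
print(may_launch_on("bob", "n01", jobs))    # bob has no job anywhere
```

The first case is the loophole: once alice has a job on n01, nothing
in this rule limits how many extra processes she starts there.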

Some sys admins also add a cleanup/sweep routine to the
Torque epilogue script to kill any processes belonging to
the user whose job just finished.
However, this is not a very good solution, because that user may
have another legitimate job still running there.
Other cleanup strategies are possible, and you may find some
suggestions and even scripts if you google around.
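
One possible, more careful variant of the sweep is sketched below in
Python: skip the cleanup when the user still has another job on the
node. This is only a sketch of the decision, not a ready-made Torque
epilogue. The function name and its inputs are hypothetical, and how
you collect the process list and the list of remaining job owners on
the node is left out.

```python
def pids_to_kill(finished_user, node_processes, remaining_job_owners):
    """Pick the processes an epilogue sweep could safely kill.

    finished_user        -- owner of the job that just ended
    node_processes       -- list of (pid, owner) pairs seen on this node
    remaining_job_owners -- set of users who still have jobs on this node
    """
    # The user has another legitimate job here, so leave everything alone.
    if finished_user in remaining_job_owners:
        return []
    # Otherwise every process the user still owns on this node is a leftover.
    return [pid for pid, owner in node_processes if owner == finished_user]


processes = [(4001, "alice"), (4002, "bob"), (4003, "alice")]
print(pids_to_kill("alice", processes, {"bob"}))           # [4001, 4003]
print(pids_to_kill("alice", processes, {"alice", "bob"}))  # []
```

The price of being careful is that stray processes survive as long as
the same user has any other job on that node.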

Moreover, if you configure your scheduler (Maui?) to
assign full nodes to jobs (no node sharing),
the cheaters will be cheating on
themselves, not stepping on other users' toes.
Look for "JOBNODEMATCHPOLICY" here:
http://docs.adaptivecomputing.com/maui/a.fparameters.php

Assigning full nodes to jobs ("EXACTNODE") may or may not be a
good choice for you.
E.g. you may consider it wasteful, if there are many serial
jobs or parallel jobs running only on a few processors, in
which case you may want to pack those jobs onto the fewest
nodes possible ("EXACTPROC"), so as to maximize throughput.
However, "no node sharing" helps prevent cheaters
from bothering other users who are running jobs on the same node,
and it is not bad at all if most of the jobs are parallel
and use many cores (say, >= number of cores per node).
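
The trade-off is easy to see with a little arithmetic, sketched here
in Python. This is not Maui's scheduling algorithm, just a count of
nodes under two simplified assumptions: every job gets whole nodes to
itself, or jobs are packed ideally with sharing allowed. The function
names and the 8 cores per node are made up for the example.

```python
import math


def nodes_without_sharing(job_cores, cores_per_node):
    """Nodes used when every job gets whole nodes to itself."""
    return sum(math.ceil(cores / cores_per_node) for cores in job_cores)


def nodes_ideal_packing(job_cores, cores_per_node):
    """Lower bound on nodes used when jobs may share nodes."""
    return math.ceil(sum(job_cores) / cores_per_node)


small_jobs = [1, 1, 1, 1, 2, 2]  # serial and tiny parallel jobs
large_jobs = [16, 32, 8]         # each job fills whole nodes

print(nodes_without_sharing(small_jobs, 8), nodes_ideal_packing(small_jobs, 8))  # 6 1
print(nodes_without_sharing(large_jobs, 8), nodes_ideal_packing(large_jobs, 8))  # 7 7
```

With many small jobs, no sharing ties up six nodes for work that fits
on one. With jobs that already fill whole nodes, it costs nothing.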

I hope this helps,
Gus Correa