On 20.11.2006 at 13:12, Epitropakis Mixalis 00064 wrote:
> Hello everyone!
> I am a member of a research laboratory at the University of Patras.
> We have ordered our first cluster and in the following days it will
> arrive. So, we will need the help of experts in order to decide which
> cluster management and job scheduling software is the most suitable
> for it
> :) .
> Each computer of the cluster consists of: 2 x (Dual-Core Intel Xeon
> Processor 5060 (3.2 GHz, 1066 MHz Bus)), (motherboard: S5000PAL)
> , 4GB ECC RAM and 250GB HDD. All parts are interconnected with a
> Gigabit Ethernet Switch.
> What we need is your opinion and experience regarding a software package
> (or a collection) that will make the cluster easier to use (job
> scheduling) and to administer (update and upgrade of
> the OS, installation of new software, user administration, etc). We
> are proficient with Linux administration (any distro).
> In our first search on the internet, we found some packages that do
> combine both job scheduling and administration. If there is a
> package that could be suggested and that could combine both we would
> be really
> We would prefer the software packages to be open source. :)
> Some of those we have found and studied so far are the following:
>  TORQUE Resource Manager
>  http://gridengine.sunsource.net/
>  http://oscar.openclustergroup.org/
>  http://dcc.irb.hr/
I think this question would find a broader audience on the beowulf.org
mailing list, but anyway: what are you using in the cluster besides
OpenMPI? Although I'm biased, I would suggest SGE (Grid Engine), as it
supports more parallel libraries than Torque through its qrsh
replacement, e.g. Linda or PVM. Also, the integration between the
qmaster and the scheduler is tighter. With Torque you have two commands:
"qstat" and "showq". The former is Torque's view of the cluster, the
latter that of the Maui scheduler, and sometimes I observe that they
disagree about what is running in the cluster and what is not (we use
SGE, but we have access to some clusters at other locations which
prefer Torque).
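To make that kind of disagreement concrete, here is a minimal sketch in Python. It assumes you have already extracted the IDs of running jobs from each command's output with a parser of your own; the parsing is not shown, and the function name and job IDs below are made up for illustration only.

```python
def compare_views(resource_manager_running, scheduler_running):
    """Return the job IDs that only one of the two views reports as running."""
    rm = set(resource_manager_running)
    sched = set(scheduler_running)
    return {
        # Jobs the resource manager lists but the scheduler does not
        "only_in_resource_manager": sorted(rm - sched),
        # Jobs the scheduler lists but the resource manager does not
        "only_in_scheduler": sorted(sched - rm),
    }


# Hypothetical job IDs, standing in for parsed "qstat" and "showq" output
torque_view = ["101", "102", "103"]
maui_view = ["102", "103", "104"]

report = compare_views(torque_view, maui_view)
print(report)
```

Two empty lists in the result mean both views agree; anything else is a job worth checking by hand.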
AFAIK, support for SGE will arrive in OpenMPI 1.2.
Question: do you have a central file server in the cluster to serve
the home directories to the nodes, which could also act as the NIS,
NTP and SGE qmaster server? You mentioned only the nodes.
> These are some of our thoughts. We know that the distribution choice
> as well as the cluster management software will apply only ONCE and we
> will not be able to test/change it easily...
> Thanks very much for your time and I am sure that your opinion will be
> of great help to us!