On Apr 13, 2011, at 8:13 AM, Rushton Martin wrote:
> The bulk of our compute nodes are 8 cores (twin 4-core IBM x3550-m2).
> Jobs are submitted by Torque/MOAB. When run with up to np=8 there is
> good performance. Attempting to run with more processors brings
> problems, specifically if any one node of a group of nodes has all 8
> cores in use the job hangs. For instance running with 14 cores (7+7) is
> fine, but running with 16 (8+8) hangs.
> From the FAQs I note the issues of over committing and aggressive
> scheduling. Is it possible for mpirun (or orted on the remote nodes) to
> be blocked from progressing by a fully committed node? We have a few
> x3755-m2 machines with 16 cores, and we have detected a similar issue
> with 16+16.
I'm not entirely sure I understand your notation, but we have never seen an issue when running with fully loaded nodes (i.e., where the number of MPI procs on the node = the number of cores).
What version of OMPI are you using? Are you binding the procs?
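For reference, one reading of the 7+7 and 8+8 notation is the number of MPI processes placed on each of two nodes. A minimal Python sketch under that assumption (the helper names are hypothetical and are not part of Open MPI or Torque) shows which layouts leave a node fully loaded, i.e. with as many processes as cores:

```python
def split_ranks(total_ranks, num_nodes):
    """Spread ranks as evenly as possible across nodes (hypothetical helper)."""
    base, extra = divmod(total_ranks, num_nodes)
    # The first `extra` nodes each take one additional rank.
    return [base + (1 if i < extra else 0) for i in range(num_nodes)]


def fully_loaded_nodes(per_node, cores_per_node):
    """Return indices of nodes where the rank count equals the core count."""
    return [i for i, n in enumerate(per_node) if n == cores_per_node]


def oversubscribed_nodes(per_node, cores_per_node):
    """Return indices of nodes with more ranks than cores."""
    return [i for i, n in enumerate(per_node) if n > cores_per_node]


if __name__ == "__main__":
    cores = 8  # twin 4-core nodes, as described in the quoted message
    for total in (14, 16):
        layout = split_ranks(total, 2)
        print(total, layout,
              "fully loaded:", fully_loaded_nodes(layout, cores),
              "oversubscribed:", oversubscribed_nodes(layout, cores))
```

Under this reading, 16 processes on two 8-core nodes is fully loaded but not oversubscribed, which is the case described above as never having caused a problem.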
> Martin Rushton
> HPC System Manager, Weapons Technologies
> Tel: 01959 514777, Mobile: 07939 219057
> email: jmrushton_at_[hidden]