Hi openmpi developers,
I have been evaluating our FEM application with the new openmpi-1.7rc7 under
the Torque job controller.
I have run into a problem: "-hostfile" does not work properly.
Since my application is hybrid (MPI+OpenMP), I have to modify
$PBS_NODEFILE and pass the result with "-hostfile".
Following the FAQ, I do not add any new hosts to the hostfile. It is just a
subset of the hosts allocated by Torque. At least, this method
works well with openmpi-1.6.x.
I hope this issue will be fixed in the next release of openmpi-1.7.
Best Regards,
Tetsuya Mishima
(1) Example of 2 MPI processes, each with 4 threads:
$PBS_NODEFILE -> modified hostfile
node01           node01
node01           node02
node01
node01
node02
node02
node02
node02
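
For reference, the reduction shown in (1) can be sketched as follows. This is only a minimal illustration, not the exact script I use: the helper names unique_hosts and write_hostfile are hypothetical, and it assumes one host name per line in $PBS_NODEFILE and that each host should appear once in the output, in first-seen order. The output name pbs_hosts matches the hostfile named in the error message in (2).

```python
import os


def unique_hosts(nodefile_lines):
    """Return each host name once, preserving first-seen order."""
    hosts = []
    for line in nodefile_lines:
        host = line.strip()
        # Skip blank lines and hosts that were already recorded.
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def write_hostfile(nodefile_path, hostfile_path):
    """Write the de-duplicated host list from nodefile_path to hostfile_path."""
    with open(nodefile_path) as src:
        hosts = unique_hosts(src)
    with open(hostfile_path, "w") as dst:
        dst.write("\n".join(hosts) + "\n")
    return hosts


if __name__ == "__main__":
    # PBS_NODEFILE is set by Torque inside a job; outside a job do nothing.
    nodefile = os.environ.get("PBS_NODEFILE")
    if nodefile:
        print(write_hostfile(nodefile, "pbs_hosts"))
```

The resulting file contains only names that already appear in $PBS_NODEFILE, so it is a strict subset of the allocation.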
(2) The error message I got is as follows:
--------------------------------------------------------------------------
A hostfile was provided that contains at least one node not
present in the allocation:
hostfile: pbs_hosts
node: node01
If you are operating in a resource-managed environment, then only
nodes that are in the allocation can be used in the hostfile. You
may find relative node syntax to be a useful alternative to
specifying absolute node names see the orte_hosts man page for
further information.
--------------------------------------------------------------------------