I'm hoping this is just user error...
I'm running a single-node job on a node with two dual-core Opterons
(Open MPI 1.0.2).
My machine file looked like this:
I have an HPL configuration for 4 processors (PxQ=2x2).
I started with 'mpirun -np 4 -machinefile foo ./xhpl'
and the problem takes 15 seconds to complete.
I change the machinefile to read:
Whichever machinefile I use, I run it with the same command:
'mpirun -np 4 -machinefile foo ./xhpl'
Except now the problem takes 0.1 sec to complete.
It's perfectly repeatable...
Is there something about the machinefile format I'm not aware of (with
respect to dual-core CPUs)? IIRC, slots=(number of processes to run per
node), so two dual-cores should be slots=4. Except 'slots=4' makes it run
roughly two orders of magnitude slower (15 sec vs. 0.1 sec).
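
For reference, here is a minimal sketch of the hostfile syntax I have in
mind. The hostname "node01" is a made-up placeholder, and this is not a
copy of either machinefile above; it only illustrates how I understand
the slots= keyword:

```
# Hypothetical hostfile entry: one node, four slots
# (two dual-core CPUs = 4 cores, so up to 4 processes on this node).
# "node01" is an assumed hostname, not my real one.
node01 slots=4
```

With a file like that saved as foo, I would expect
'mpirun -np 4 -machinefile foo ./xhpl' to place all four processes on
the one node, one per core.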