On Oct 31, 2007, at 10:45 AM, Karsten Bolding wrote:
> In a different thread I read about a performance penalty in OpenMPI if
> more than one MPI-process is running on one processor/core - is that
> correct? I mean having max-slots>4 on a quad-core machine.

Open MPI polls for message passing progress (to get the absolute
minimum latency -- it can be faster than blocking/waking up). If you
oversubscribe a machine (run more processes than it has processors),
Open MPI will usually detect that and know to call yield() in the
middle of its polling so that other processes can get swapped in and
make progress.

But if you lie to Open MPI and tell it that there are more processors
than there really are, we may not recognize that the machine is
oversubscribed and therefore not call yield(). Hence, performance
will *really* go down the drain.
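
To make the mechanism concrete, here is a rough sketch of the idea in Python. It is not Open MPI's actual code: looks_oversubscribed(), poll_until(), and their parameters are hypothetical names made up for illustration, and the only real system call involved is os.sched_yield(). The assumption is that the "am I oversubscribed?" decision is based purely on the slot count the runtime was told about, which is why an inflated count turns yielding off.

```python
import os
import time

# Fall back to a zero-length sleep where os.sched_yield() is unavailable.
_yield_cpu = getattr(os, "sched_yield", lambda: time.sleep(0))


def looks_oversubscribed(local_procs, declared_slots):
    """Judge oversubscription from the declared slot count only (hypothetical).

    The check never inspects the real hardware, so an inflated
    declared_slots value hides a genuine overload.
    """
    return local_procs > declared_slots


def poll_until(done, yield_when_idle, max_spins=1_000_000):
    """Spin until done() is true; return how many times we had to spin."""
    spins = 0
    while not done():
        spins += 1
        if spins >= max_spins:
            raise TimeoutError("gave up polling")
        if yield_when_idle:
            _yield_cpu()  # let another runnable process use the core
    return spins


if __name__ == "__main__":
    # Quad-core node running 8 local processes.
    honest = looks_oversubscribed(local_procs=8, declared_slots=4)
    lied = looks_oversubscribed(local_procs=8, declared_slots=8)
    print(honest, lied)  # True False
```

With the honest slot count, the poller would be run with yield_when_idle=True and give up the core on every empty poll. With the inflated count, it spins without yielding, so 8 busy-polling processes compete for 4 cores and each one burns its whole time slice before anyone else gets to run.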