I noticed that my OpenMPI processes are spending more time in system mode
than in user mode (according to vmstat and top). I'm running on dual-CPU,
dual-core Opteron nodes with 4 slots per node, and the program has the
nodes to itself. A closer look showed that the processes are constantly
switching between the run and sleep states, with 4-8 page faults per second.
Why would this be? It doesn't happen with 4 sequential jobs running on a
node, where I get about 99% user time and maybe 1% system time.
The processes have plenty of memory. This behavior occurs whether I use
processor/memory affinity or not (there is no oversubscription).
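
In case it helps anyone compare numbers, here is a minimal sketch of how the
per-process figures mentioned above (user vs. system time, page faults,
context switches) could be collected from inside a process rather than read
off vmstat or top. It is not taken from my actual program: the function names
usage_snapshot, usage_delta and system_fraction are made up for illustration,
the workload is a stand-in loop, and it assumes a Unix system where Python's
resource module is available. The same counters are available in C through
getrusage(2).

```python
import resource


def usage_snapshot(who=resource.RUSAGE_SELF):
    """Return CPU times, page faults and context switches for this process."""
    ru = resource.getrusage(who)
    return {
        "user_s": ru.ru_utime,              # CPU seconds spent in user mode
        "system_s": ru.ru_stime,            # CPU seconds spent in the kernel
        "minor_faults": ru.ru_minflt,       # page faults served without I/O
        "major_faults": ru.ru_majflt,       # page faults that required I/O
        "voluntary_switches": ru.ru_nvcsw,  # process gave up the CPU (e.g. blocked)
        "involuntary_switches": ru.ru_nivcsw,  # process was preempted
    }


def usage_delta(before, after):
    """Difference between two snapshots, counter by counter."""
    return {key: after[key] - before[key] for key in before}


def system_fraction(delta):
    """Share of the consumed CPU time that was spent in system mode."""
    total = delta["user_s"] + delta["system_s"]
    return delta["system_s"] / total if total > 0 else 0.0


if __name__ == "__main__":
    before = usage_snapshot()
    # Stand-in workload (assumption); in a real run this would be the
    # compute/communication phase of one MPI rank.
    checksum = sum(i * i for i in range(2_000_000))
    delta = usage_delta(before, usage_snapshot())
    for name, value in delta.items():
        print(f"{name}: {value}")
    print(f"system fraction: {system_fraction(delta):.2%}")
```

Taking one snapshot before and one after the phase of interest in each
process gives a per-process breakdown that can be compared directly between
the MPI run and the 4 sequential jobs.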