I've compiled OMPI 1.8 on an x86-64 Linux cluster using the PGI compilers v14.1 (I've also tried PGI v11.10 and get the same result). I'm able to compile with the resulting mpicc/mpifort/etc. When running the codes, everything seems to work fine as long as only one job is running on a given compute node. However, whenever a second job gets assigned to the same compute node, the CPU load of every process is halved. I'm using PBS Torque. As an example:
- Submit jobA via Torque to node1 using mpirun -n 4
- All 4 processes of jobA show 100% CPU load.
- Submit jobB via Torque to node1 using mpirun -n 4
- All 8 processes (4 from jobA and 4 from jobB) show 50% CPU load.
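For reference, the submissions above look roughly like the following Torque job script (the script name, job name, and executable name are placeholders, not my actual files; jobB is identical apart from the names). I've added --report-bindings here since, as I understand it, it should print which cores each rank is bound to and so reveal whether the two jobs end up pinned to the same cores:

```shell
#!/bin/bash
# jobA.pbs -- hypothetical Torque job script for jobA
#PBS -N jobA
#PBS -l nodes=1:ppn=4

cd "$PBS_O_WORKDIR"

# --report-bindings makes mpirun print the core binding of every
# rank to stderr, so overlapping bindings between jobA and jobB
# would show up directly in the job output.
mpirun -n 4 --report-bindings ./jobA.x
```

Both jobs are then submitted with qsub (e.g. `qsub jobA.pbs`).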
Moreover, while either jobA or jobB would run in 30 minutes by itself, when both jobs are on the same node they have gone 14 hours without completing.
I'm attaching config.log and the output of ompi_info --all (bzipped).
Some more info:
$> ompi_info | grep tm
MCA ess: tm (MCA v2.0, API v3.0, Component v1.8)
MCA plm: tm (MCA v2.0, API v2.0, Component v1.8)
MCA ras: tm (MCA v2.0, API v2.0, Component v1.8)
Sorry if this is a common problem; I've searched for posts discussing similar issues but haven't been able to find any.
Thanks for your help,