Did you remember to set --bind-to-core or --bind-to-socket on the command line? Otherwise the processes run unbound, which can make a significant difference to performance.
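
If it helps, one quick way to see whether a process actually ended up bound is to have it print the set of CPUs it is allowed to run on. This is only a minimal sketch, assuming Linux and Python 3: os.sched_getaffinity is Linux-only, and the helper name describe_binding is mine, not anything from Open MPI. It also assumes the node is not already restricted by a cpuset or container, which would make an unbound process look bound.

```python
import os


def describe_binding(allowed_cpus, total_cpus):
    """Summarize whether a process looks bound, given its allowed CPU set."""
    n = len(allowed_cpus)
    if n >= total_cpus:
        # The process may be scheduled on every CPU: effectively unbound.
        return "unbound (may run on all %d CPUs)" % total_cpus
    return "bound to %d of %d CPUs: %s" % (n, total_cpus, sorted(allowed_cpus))


if __name__ == "__main__":
    # os.sched_getaffinity exists only on Linux; skip the check elsewhere.
    if hasattr(os, "sched_getaffinity"):
        allowed = os.sched_getaffinity(0)  # 0 means the calling process
        total = os.cpu_count() or len(allowed)
        print(os.getpid(), describe_binding(allowed, total))
    else:
        print("CPU affinity query not available on this platform")
```

Launching this once with and once without --bind-to-core should show whether the binding took effect on each node.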
On Jul 9, 2010, at 3:15 AM, Andreas Schäfer wrote:
> Maybe I should add that for the tests I ran the benchmarks with two MPI
> processes: for InfiniBand, one process per node; for shared memory,
> both processes were located on one node.
> Andreas Schäfer
> HPC and Grid Computing
> Chair of Computer Science 3
> Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
> +49 9131 85-27910