To sum up and give an update:
The extended communication times seen with shared-memory communication
between Open MPI processes are caused by the Open MPI session directory
residing on a network filesystem mounted via NFS.
The problem is resolved by setting up a ramdisk or mounting a tmpfs on
each diskless node. Setting the MCA parameter orte_tmpdir_base to point
to the corresponding mount point keeps the shared-memory backing files
local, which decreases the communication times by orders of magnitude.
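For anyone hitting the same thing, here is a minimal sketch of how one might check where a candidate session directory actually lives and build the matching mpirun call. It is only an illustration: the helper names and the /tmp/ompi mount point are made up for this example, and the only things taken from the discussion above are the orte_tmpdir_base parameter and the usual "mpirun --mca <name> <value>" syntax.

```python
import os

# Filesystem types treated as "on the network" for this check (assumption).
NETWORK_FS = {"nfs", "nfs4"}


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` is text in /proc/mounts format
    (device, mount point, type, options, ...), one mount per line.
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Longest mount point wins; on a tie the later entry shadows
            # the earlier one, as it does in the real mount table.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


def is_network_fs(path, mounts_text):
    """True if `path` sits on one of the filesystem types in NETWORK_FS."""
    return fs_type_for(path, mounts_text) in NETWORK_FS


def mpirun_command(tmpdir, nprocs, program):
    """Build an mpirun argv that points orte_tmpdir_base at `tmpdir`."""
    return ["mpirun", "--mca", "orte_tmpdir_base", tmpdir,
            "-np", str(nprocs), program]
```

On a node, one would pass the contents of /proc/mounts as mounts_text and, if is_network_fs() reports True for the default temporary directory, use a local tmpfs mount point (such as the hypothetical /tmp/ompi) with mpirun_command() instead.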
The relation of the problem to the kernel version is not really
resolved, but the kernel version may not be "the problem" in this respect.
My benchmark is now running fine on a single node with 4 CPUs, kernel
184.108.40.206 and openmpi 1.4.1.
Running on multiple nodes, I still see higher (TCP) communication
times than I would expect. But that requires some deeper research
into the issue on my part (e.g. collisions on the network) and should
probably be posted to a new thread.
Thank you guys for your help.