We've tried to run a multithreaded application with a recent trunk version (March 21) of OpenMPI. We need this version because of its CUDA RDMA support. OpenMPI binds all the threads to a single core, which is undesirable.
In OpenMPI 1.5 there was an option --cpus-per-rank, which should have helped in this case, as well as --bind-to-none.
Unfortunately, these options are now gone, and I couldn't figure out how to get the same behavior with the newest version.
Can anyone offer any hints on this?
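
In case it helps with reproducing this, here is a minimal sketch of how the binding can be checked from inside a launched process. It is a stand-in, not our actual application. It is Linux only, and the helper names current_affinity and report_affinity are made up for illustration. Each thread reports the set of cores it is allowed to run on. If every thread lists the same single core, the process has been bound to one core.

```python
import os
import threading


def current_affinity():
    """Return the sorted CPU ids the calling thread may run on (Linux only)."""
    # pid 0 means "the caller"; new threads inherit the mask of their creator.
    return sorted(os.sched_getaffinity(0))


def report_affinity(num_threads=4):
    """Start num_threads threads and collect the affinity mask seen by each."""
    results = {}

    def worker(idx):
        results[idx] = current_affinity()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print("main thread:", current_affinity())
    for idx, cpus in sorted(report_affinity().items()):
        print(f"thread {idx}: {cpus}")
```

Saved under a hypothetical name such as check_affinity.py, it can be started through mpirun like any other executable (for example mpirun -np 2 python3 check_affinity.py), so that each rank prints the mask it was given.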