On Mar 20, 2014, at 12:48 PM, Beichuan Yan <beichuan.yan_at_[hidden]> wrote:
> 2. http://www.open-mpi.org/community/lists/users/2011/11/17684.php
> In the upcoming OMPI v1.7, we revamped the shared memory setup code such that it'll actually use /dev/shm properly, or use some other mechanism other than a mmap file backed in a real filesystem. So the issue goes away.
> my comment: up to OMPI v1.7.4, this shmem issue is still there. However, it is resolved in OMPI v1.7.5rc5. This is surprising.
> Anyway, OMPI v1.7.5rc5 works well for multi-processes-on-one-node (shmem) mode on Spirit. There is no need to tune TCP or IB parameters to use it. My code just runs well:
> My test data takes 20 minutes to run with OMPI v1.7.4, but needs less than 1 minute with OMPI v1.7.5rc5. I don't know what the magic is. I am wondering when OMPI v1.7.5 final will be released.
Wow -- that sounds like a fundamental difference there. Could be something to do with the NFS tmp directory...? I could see how that could cause oodles of unnecessary network traffic.
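If you want to check the NFS theory on a node, something like the following rough sketch would do it. It parses `/proc/mounts` (Linux only) and reports which filesystem backs the temporary directory. The helper names `find_mount` and `on_network_fs` and the `NETWORK_FS` set are made up for illustration, and it assumes the directory your job uses is the one Python's `tempfile.gettempdir()` returns, which may not match what your MPI job actually uses.

```python
import os
import tempfile

# Filesystem types treated as "network" here. This set is an assumption;
# extend it for whatever your site actually mounts.
NETWORK_FS = {"nfs", "nfs4", "lustre", "gpfs", "panfs", "cifs"}


def find_mount(path, mounts_text):
    """Return (mount_point, fstype) for the longest mount point containing path.

    mounts_text is text in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    path = os.path.abspath(path)
    best = ("", "")
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        mount_point, fstype = fields[1], fields[2]
        # Compare with a trailing slash so "/tmp" does not match "/tmpx".
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) > len(best[0]):
            best = (mount_point, fstype)
    return best


def on_network_fs(path, mounts_text):
    """True if path lives on one of the filesystem types in NETWORK_FS."""
    return find_mount(path, mounts_text)[1] in NETWORK_FS


if __name__ == "__main__":
    # Resolve symlinks first so the lookup sees the real location.
    tmp = os.path.realpath(tempfile.gettempdir())
    with open("/proc/mounts") as f:
        mounts = f.read()
    mount_point, fstype = find_mount(tmp, mounts)
    print(f"{tmp} is on {mount_point} (type {fstype})")
    if on_network_fs(tmp, mounts):
        print("Temporary directory is on a network filesystem.")
```

Run it on a compute node rather than the login node, since the two may mount `/tmp` differently.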
1.7.5 should be released ...imminently...