Sorry for the delay in replying.
Can you try upgrading to Open MPI 1.8, which was released last week? We refreshed the version of ROMIO that is included in OMPI 1.8 vs. 1.6.
On Apr 8, 2014, at 6:49 PM, Daniel Milroy <Daniel.Milroy_at_[hidden]> wrote:
> Recently a couple of our users have experienced difficulties with compute jobs failing with OpenMPI 1.6.4 compiled against GCC 4.7.2, with the nodes running kernel 2.6.32-279.5.2.el6.x86_64. The error is:
> File locking failed in ADIOI_Set_lock(fd 7,cmd F_SETLKW/7,type F_WRLCK/1,whence 0) with return value FFFFFFFF and errno 26.
> - If the file system is NFS, you need to use NFS version 3, ensure that the lockd daemon is running on all the machines, and mount the directory with the 'noac' option (no attribute caching).
> - If the file system is LUSTRE, ensure that the directory is mounted with the 'flock' option.
> ADIOI_Set_lock:: Function not implemented
> ADIOI_Set_lock:offset 0, length 8
> The file system in question is indeed Lustre, and mounting with flock isn't possible in our environment. I recommended the following changes to the users' code:
> MPI_Info_set(info, "collective_buffering", "true");
> MPI_Info_set(info, "romio_lustre_ds_in_coll", "disable");
> MPI_Info_set(info, "romio_ds_read", "disable");
> MPI_Info_set(info, "romio_ds_write", "disable");
> These changes result in the same error as before. Are there any other MPI options I can set?
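For reference, a minimal self-contained sketch of how those hints would be passed to MPI-IO via MPI_File_open (the filename and communicator here are illustrative assumptions, not taken from the original report):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    MPI_Info info;

    MPI_Init(&argc, &argv);
    MPI_Info_create(&info);

    /* The hints from the message above: enable collective buffering
       and disable ROMIO's data sieving, which is what issues the
       fcntl() byte-range locks that fail on a Lustre mount
       without the 'flock' option. */
    MPI_Info_set(info, "collective_buffering", "true");
    MPI_Info_set(info, "romio_lustre_ds_in_coll", "disable");
    MPI_Info_set(info, "romio_ds_read", "disable");
    MPI_Info_set(info, "romio_ds_write", "disable");

    /* "output.dat" is a placeholder path on the Lustre file system. */
    if (MPI_File_open(MPI_COMM_WORLD, "output.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      info, &fh) != MPI_SUCCESS) {
        fprintf(stderr, "MPI_File_open failed\n");
    } else {
        MPI_File_close(&fh);
    }

    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}
```

Note that hints are advisory: an MPI implementation is free to ignore any it does not recognize, so passing them is safe even if a given ROMIO build does not honor them.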
> Thank you in advance for any advice,
> Dan Milroy