Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] File locking in ADIO, OpenMPI 1.6.4
From: Rob Latham (robl_at_[hidden])
Date: 2014-04-14 16:23:31


On 04/08/2014 05:49 PM, Daniel Milroy wrote:
> Hello,
>
> The file system in question is indeed Lustre, and mounting with flock
> isn’t possible in our environment. I recommended the following changes
> to the users’ code:

Hi. I'm the ROMIO guy, though I do rely on the community to help me
keep the Lustre driver up to snuff.

> MPI_Info_set(info, "collective_buffering", "true");
> MPI_Info_set(info, "romio_lustre_ds_in_coll", "disable");
> MPI_Info_set(info, "romio_ds_read", "disable");
> MPI_Info_set(info, "romio_ds_write", "disable");
>
> Which results in the same error as before. Are there any other MPI
> options I can set?

I'd like to hear more about the workload generating these lock messages,
but I can tell you the situations in which ADIOI_SetLock gets called:
- everywhere in NFS. If you have a Lustre file system exported to some
clients as NFS, those clients will get the NFS driver (er, that might
not be true unless you pick up a recent patch)
- when writing a non-contiguous region in the file, unless you disable
data sieving, as you did above.
- note: you don't need to disable data sieving for reads, though you
might want to if the data sieving algorithm is reading a lot of data
that it then throws away.
- if atomic mode was set on the file (i.e. you called
MPI_File_set_atomicity)
- if you use any of the shared file pointer operations
- if you use any of the ordered mode collective operations
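
As a rough aid for working through the cases above, the list can be written down as a small checklist function. This is only a sketch of the list in this message, not ROMIO's actual code path: the struct, its field names, and the enum are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical description of one file access. These names are
 * illustrative only and do not appear anywhere in ROMIO. */
struct access_pattern {
    bool nfs_driver;          /* file is handled by the NFS driver */
    bool is_write;            /* the access is a write */
    bool noncontiguous;       /* the region in the file has holes */
    bool ds_write_disabled;   /* romio_ds_write was set to "disable" */
    bool atomic_mode;         /* MPI_File_set_atomicity was called */
    bool shared_file_pointer; /* e.g. MPI_File_write_shared */
    bool ordered_mode;        /* e.g. MPI_File_write_ordered */
};

enum lock_reason {
    LOCK_NONE,
    LOCK_NFS,
    LOCK_ATOMIC,
    LOCK_SHARED_FP,
    LOCK_ORDERED,
    LOCK_DS_WRITE
};

/* Return the first case from the list that applies, or LOCK_NONE. */
enum lock_reason lock_reason(const struct access_pattern *a)
{
    if (a->nfs_driver)          return LOCK_NFS;
    if (a->atomic_mode)         return LOCK_ATOMIC;
    if (a->shared_file_pointer) return LOCK_SHARED_FP;
    if (a->ordered_mode)        return LOCK_ORDERED;
    /* Data sieving only matters here for non-contiguous writes. */
    if (a->is_write && a->noncontiguous && !a->ds_write_disabled)
        return LOCK_DS_WRITE;
    return LOCK_NONE;
}

/* Human-readable name for a reason, for printing. */
const char *lock_reason_name(enum lock_reason r)
{
    switch (r) {
    case LOCK_NFS:       return "NFS driver";
    case LOCK_ATOMIC:    return "atomic mode";
    case LOCK_SHARED_FP: return "shared file pointer";
    case LOCK_ORDERED:   return "ordered mode collective";
    case LOCK_DS_WRITE:  return "data sieving write";
    default:             return "none";
    }
}
```

With data sieving writes disabled, a non-contiguous write falls through to LOCK_NONE unless one of the other flags is set, which is the situation described in this thread.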

You've turned off data sieving writes, which is what I would have first
guessed would trigger this lock message. So I guess you are hitting one
of the other cases.

==rob

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA