
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] File locking in ADIO, OpenMPI 1.6.4
From: Sasso, John (GE Power & Water, Non-GE) (John1.Sasso_at_[hidden])
Date: 2014-04-16 14:14:28


Dan,

On the hosts where the ADIOI lock error occurs, are there any NFS errors in /var/log/messages, dmesg, or similar that refer to lockd?
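
For example, running "grep -i lockd /var/log/messages" or "dmesg | grep -i lock" on an affected node should surface anything relevant (log paths may differ by distro).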

--john

-----Original Message-----
From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Daniel Milroy
Sent: Tuesday, April 15, 2014 10:55 AM
To: Open MPI Users
Subject: Re: [OMPI users] File locking in ADIO, OpenMPI 1.6.4

Hi Rob,

The applications of the two users in question are different; I haven't looked through much of either code. I can respond to your highlighted situations in sequence:

>- everywhere in NFS. If you have a Lustre file system exported to some
>clients as NFS, you'll get NFS (er, that might not be true unless you
>pick up a recent patch)
The compute nodes are Lustre clients mounting the file system via IB.

>- note: you don't need to disable data sieving for reads, though you
>might want to if the data sieving algorithm is wasting a lot of data.
That's good to know, though given the applications I can't say whether data sieving is wasting data.

>- if atomic mode was set on the file (i.e. you called
>MPI_File_set_atomicity)
>- if you use any of the shared file pointer operations
>- if you use any of the ordered mode collective operations
I don't know, but I'll pass these questions on to the users; for reference, a sketch of the relevant calls is below.
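
A minimal sketch of the call families Rob lists (the file name and buffer are placeholders, not taken from the users' codes):

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_File fh;
        int buf = 42;

        MPI_Init(&argc, &argv);
        MPI_File_open(MPI_COMM_WORLD, "output.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        /* Any of the following will make ROMIO take file locks: */
        MPI_File_set_atomicity(fh, 1);               /* atomic mode */
        MPI_File_write_shared(fh, &buf, 1, MPI_INT,
                              MPI_STATUS_IGNORE);    /* shared file pointer */
        MPI_File_write_ordered(fh, &buf, 1, MPI_INT,
                               MPI_STATUS_IGNORE);   /* ordered mode collective */

        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }

I'll grep for these names in their sources.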

Thank you,

Dan Milroy
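
P.S. For completeness, the hints quoted below are attached at open time roughly like this (a sketch; the file name is a placeholder):

    MPI_Info info;
    MPI_File fh;

    MPI_Info_create(&info);
    MPI_Info_set(info, "collective_buffering", "true");
    MPI_Info_set(info, "romio_lustre_ds_in_coll", "disable");
    MPI_Info_set(info, "romio_ds_read", "disable");
    MPI_Info_set(info, "romio_ds_write", "disable");

    MPI_File_open(MPI_COMM_WORLD, "output.dat",
                  MPI_MODE_CREATE | MPI_MODE_RDWR, info, &fh);
    MPI_Info_free(&info);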

On 4/14/14, 2:23 PM, "Rob Latham" <robl_at_[hidden]> wrote:

>
>
>On 04/08/2014 05:49 PM, Daniel Milroy wrote:
>> Hello,
>>
>> The file system in question is indeed Lustre, and mounting with flock
>> isn't possible in our environment. I recommended the following
>> changes to the users' code:
>
>Hi. I'm the ROMIO guy, though I do rely on the community to help me
>keep the lustre driver up to snuff.
>
>> MPI_Info_set(info, "collective_buffering", "true");
>> MPI_Info_set(info, "romio_lustre_ds_in_coll", "disable");
>> MPI_Info_set(info, "romio_ds_read", "disable"); MPI_Info_set(info,
>> "romio_ds_write", "disable");
>>
>> Applying these hints results in the same error as before. Are there
>> any other MPI options I can set?
>
>I'd like to hear more about the workload generating these lock
>messages, but I can tell you the situations in which ADIOI_SetLock gets called:
>- everywhere in NFS. If you have a Lustre file system exported to some
>clients as NFS, you'll get NFS (er, that might not be true unless you
>pick up a recent patch)
>- when writing a non-contiguous region in file, unless you disable data
>sieving, as you did above.
>- note: you don't need to disable data sieving for reads, though you
>might want to if the data sieving algorithm is wasting a lot of data.
>- if atomic mode was set on the file (i.e. you called
>MPI_File_set_atomicity)
>- if you use any of the shared file pointer operations
>- if you use any of the ordered mode collective operations
>
>you've turned off data sieving writes, which is what I would have first
>guessed would trigger this lock message. So I guess you are hitting
>one of the other cases.
>
>==rob
>
>--
>Rob Latham
>Mathematics and Computer Science Division
>Argonne National Lab, IL USA

_______________________________________________
users mailing list
users_at_[hidden]
http://www.open-mpi.org/mailman/listinfo.cgi/users