Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with NFS + PVFS2 + OpenMPI
From: Robert Latham (robl_at_[hidden])
Date: 2008-05-29 15:41:56


On Thu, May 29, 2008 at 04:24:18PM -0300, Davi Vercillo C. Garcia wrote:
> Hi,
>
> I'm trying to run my program in my environment and I'm running into
> some problems. My environment is based on PVFS2 over NFS (PVFS is
> mounted over an NFS partition), Open MPI, and Ubuntu. My program uses
> the MPI-IO and BZ2 development libraries. When I try to run it, this
> message appears:
>
> File locking failed in ADIOI_Set_lock. If the file system is NFS, you
> need to use NFS version 3, ensure that the lockd daemon is running on
> all the machines, and mount the directory with the 'noac' option (no
> attribute caching).
> [campogrande05.dcc.ufrj.br:05005] MPI_ABORT invoked on rank 0 in
> communicator MPI_COMM_WORLD with errorcode 1
> mpiexec noticed that job rank 1 with PID 5008 on node campogrande04
> exited on signal 15 (Terminated).

Hi.

NFS has some pretty sloppy consistency semantics. If you want
parallel I/O to NFS you have to turn off some caches (the 'noac'
option in your error message) and work pretty hard to flush
client-side caches (which ROMIO does for you using fcntl locks). If
you do this, note that your performance will be really bad, but you'll
get correct results.
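
For example, a mount invocation along these lines (the server name,
export path, and mount point here are placeholders for whatever your
setup actually uses):

    mount -t nfs -o vers=3,noac nfsserver:/export/pvfs2 /mnt/pvfs2-nfs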

Your NFS-exported PVFS volumes will still give you pretty decent serial
I/O performance, since that's the only case where NFS client-side
caching actually helps.

I'd suggest, though, that you try using straight PVFS for your MPI-IO
application, as long as the parallel clients have access to all of
the PVFS servers (if tools like pvfs2-ping and pvfs2-ls work, then they
do). You'll get better performance for a variety of reasons and can
continue to keep your NFS-exported PVFS volumes up at the same time.
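
If your Open MPI/ROMIO build has PVFS2 support, you don't even need a
kernel mount for this: prefix the filename with "pvfs2:" and ROMIO will
talk to the PVFS servers directly. A rough sketch (the path is made up;
substitute your own PVFS volume):

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "pvfs2:/pvfs2-vol/output.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);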

Oh, I see you want to use ordered I/O in your application. PVFS
doesn't support that mode. However, since you know how much data each
process wants to write, a combination of MPI_Scan (to compute each
process's offset) and MPI_File_write_at_all (to carry out the
collective I/O) will give you the same result with likely better
performance (and has the nice side effect of working with PVFS).
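
Something like the following (an untested sketch; "buf", "mylen", and
the filename are placeholders for whatever your bzip2 step produces):

    #include <mpi.h>

    /* Each rank writes "mylen" bytes from "buf" into "filename", packed
     * one after another in rank order -- the same layout ordered mode
     * would give you. */
    void write_chunks(MPI_Comm comm, const char *filename,
                      void *buf, long long mylen)
    {
        MPI_File fh;
        long long scan_end = 0;
        MPI_Offset offset;

        /* Inclusive prefix sum of the lengths up to and including mine... */
        MPI_Scan(&mylen, &scan_end, 1, MPI_LONG_LONG, MPI_SUM, comm);
        /* ...minus my own length gives my starting offset in the file. */
        offset = (MPI_Offset)(scan_end - mylen);

        MPI_File_open(comm, (char *)filename,
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
        /* Collective write at explicit offsets: equivalent result to
         * MPI_File_write_ordered, but supported (and fast) on PVFS. */
        MPI_File_write_at_all(fh, offset, buf, (int)mylen, MPI_BYTE,
                              MPI_STATUS_IGNORE);
        MPI_File_close(&fh);
    }

You could also combine this with the "pvfs2:" filename prefix mentioned
above to skip the kernel mount entirely.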

==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B