Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] 1.6.2rc2: opal_path_nfs failure for "bind" mount
From: Paul Hargrove (phhargrove_at_[hidden])
Date: 2012-09-12 16:45:44


Sounds like that should resolve my failure - I'll try to verify from a
nightly tarball when I have the opportunity.
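
For archive readers, the "skip the entry if its type is none" approach described in the quoted message from Jeff could look roughly like the sketch below. This is only an illustration, not the actual change to the opal_path_nfs.c test: struct mount_entry, parse_mount_line() and should_skip_entry() are hypothetical names, and the line format assumed is that of the `mount` output quoted later in this thread ("SRC on DIR type TYPE (OPTS)"). Mount points containing spaces are not handled.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one line of `mount` output. */
struct mount_entry {
    char dir[256];    /* mount point, e.g. "/soft" */
    char fstype[64];  /* reported type, e.g. "none" for a bind mount */
};

/* Parse "<source> on <dir> type <fstype> (<options>)".
 * Returns true only if both the directory and the type were read. */
static bool parse_mount_line(const char *line, struct mount_entry *out)
{
    return sscanf(line, "%*s on %255s type %63s",
                  out->dir, out->fstype) == 2;
}

/* Bind mounts show up with type "none" in `mount` output, so the test
 * cannot tell from this line alone whether the path is on a network
 * filesystem; such entries are simply skipped. */
static bool should_skip_entry(const struct mount_entry *e)
{
    return strcmp(e->fstype, "none") == 0;
}
```

With the /soft line from the original report, parse_mount_line() fills in dir "/soft" and fstype "none", so should_skip_entry() returns true and the test would not form an expectation for that path.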

The fix I had in mind would have been to parse the mounts with sufficient
intelligence to match a bind-mount to the original mount and determine its
type.
I suppose that is still possible if one gets ambitious.
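
A rough sketch of what such matching might involve, for anyone who does get ambitious: take the source directory of the bind mount and find the non-bind mount whose mount point is the longest path prefix of it, then use that mount's type. Everything here is an assumption for illustration: struct mount_entry, is_path_prefix() and resolve_bind_type() are hypothetical names, not part of get_mounts(), and chains of bind mounts are not followed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of one mount table line. */
struct mount_entry {
    const char *source;  /* device, or the source directory of a bind mount */
    const char *dir;     /* mount point */
    const char *fstype;  /* "none" for a bind mount in `mount` output */
};

/* Nonzero if `prefix` is `path` itself or one of its parent directories. */
static int is_path_prefix(const char *prefix, const char *path)
{
    size_t n = strlen(prefix);

    if (0 == n || strncmp(prefix, path, n) != 0) {
        return 0;
    }
    /* Require a path-component boundary so "/gpfs" does not match "/gpfsx". */
    return prefix[n - 1] == '/' || path[n] == '\0' || path[n] == '/';
}

/* Return the type of the mount that contains the bind mount's source
 * directory, or NULL if no non-bind mount contains it. */
static const char *resolve_bind_type(const struct mount_entry *tab,
                                     size_t count,
                                     const struct mount_entry *bind)
{
    const char *best = NULL;
    size_t best_len = 0;

    for (size_t i = 0; i < count; ++i) {
        size_t len;

        if (strcmp(tab[i].fstype, "none") == 0) {
            continue;  /* another bind mount: carries no type information */
        }
        if (!is_path_prefix(tab[i].dir, bind->source)) {
            continue;
        }
        len = strlen(tab[i].dir);
        if (len >= best_len) {  /* longest mount point wins; later entry on ties */
            best = tab[i].fstype;
            best_len = len;
        }
    }
    return best;
}
```

For the case reported in this thread, the bind source is /gpfs/vesta_scratch/software; if the mount table also had a gpfs entry mounted at a parent directory such as /gpfs/vesta_scratch (that entry is not shown anywhere in this thread, so it is a guess), the lookup would return "gpfs" for /soft.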

-Paul

On Wed, Sep 12, 2012 at 7:38 AM, Jeff Squyres <jsquyres_at_[hidden]> wrote:

> I just updated the test to check and see if we get a "none" type of
> filesystem. If so, we just skip it in the test.
>
>
> On Sep 11, 2012, at 3:50 PM, Paul Hargrove wrote:
>
> > I am NOT running on BG/Q.
> > I am just building for Linux/PPC64 on its front-end node which has very
> > recent XLC versions installed.
> >
> > I did look quickly just now at the opal_path_nfs.c test code and see
> > that get_mounts() will require non-trivial work to process bind-mounts.
> > The work is "just a matter of coding", but is beyond the scope of what I
> > can contribute right now. I can test as needed, though anybody w/ root on
> > a Linux box and an NFS filesystem should be able to reproduce the problem.
> >
> > -Paul [who probably could have avoided confusion by not mentioning BG/Q
> > in the first place]
> >
> >
> > On Tue, Sep 11, 2012 at 12:40 PM, Ralph Castain <rhc_at_[hidden]> wrote:
> > Interesting - I can certainly fix the test so it lets make check
> > complete.
> >
> > FWIW: I didn't know we were running on BG/Q - does it work? I assume
> > this is with slurm?
> >
> > On Sep 11, 2012, at 12:34 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
> >
> >> In testing 1.6.2rc2 on a BG/Q front-end I've encountered the following
> >> failure from "make check":
> >>
> >> Failure : Mismatch: input "/soft", expected:0 got:1
> >> SUPPORT: OMPI Test failed: opal_path_nfs() (1 of 20 failed)
> >> FAIL: opal_path_nfs
> >>
> >> What I find digging deeper is that the mount of /soft is a bit unusual
> >> (at least to me):
> >>
> >> $ grep /soft /etc/fstab
> >> /gpfs/vesta_scratch/software /soft none _netdev,bind 0 0
> >> $ mount | grep /soft
> >> /gpfs/vesta_scratch/software on /soft type none (rw,bind,_netdev)
> >> $ grep /soft /proc/mounts
> >> /dev/vesta_scratch /soft gpfs rw,relatime 0 0
> >>
> >>
> >> Looking into the mount man page I find that the "_netdev" is NOT a
> >> problem, just a keyword used to identify mounts that require network
> >> access to implement "-O no_netdev" in mount.
> >> The "problem" that opal_path_nfs is encountering is that this is a
> >> "bind" mount which makes an already mounted fs (or subtree of one)
> >> available at a second location.
> >>
> >> If I am understanding "expected:0 got:1" correctly this failure shows
> >> that the TEST is getting this case (bind-mount of GPFS fs) incorrect.
> >> So, this is a BENIGN failure, but distracting (and preventing "make
> >> check" from completing).
> >>
> >> -Paul
> >>
> >> --
> >> Paul H. Hargrove PHHargrove_at_[hidden]
> >> Future Technologies Group
> >> Computer and Data Sciences Department Tel: +1-510-495-2352
> >> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> >>
> >> _______________________________________________
> >> devel mailing list
> >> devel_at_[hidden]
> >> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >
> >
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
>

-- 
Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group
Computer and Data Sciences Department     Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900