
Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] 1.6.2rc2: opal_path_nfs failure for "bind" mount
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2012-09-12 10:38:59


I just updated the test to check and see if we get a "none" type of filesystem. If so, we just skip it in the test.

On Sep 11, 2012, at 3:50 PM, Paul Hargrove wrote:

> I am NOT running on BG/Q.
> I am just building for Linux/PPC64 on its front-end node which has very recent XLC versions installed.
>
> I did just look quickly at the opal_path_nfs.c test code, and I see that get_mounts() will require non-trivial work to process bind mounts. The work is "just a matter of coding", but it is beyond the scope of what I can contribute right now. I can test as needed, though anybody with root on a Linux box and an NFS filesystem should be able to reproduce the problem.
>
> -Paul [who probably could have avoided confusion by not mentioning BG/Q in the first place]
>
>
> On Tue, Sep 11, 2012 at 12:40 PM, Ralph Castain <rhc_at_[hidden]> wrote:
> Interesting - I can certainly fix the test so it lets make check complete.
>
> FWIW: I didn't know we were running on BG/Q - does it work? I assume this is with slurm?
>
> On Sep 11, 2012, at 12:34 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
>
>> In testing 1.6.2rc2 on a BG/Q front-end I've encountered the following failure from "make check":
>>
>> Failure : Mismatch: input "/soft", expected:0 got:1
>> SUPPORT: OMPI Test failed: opal_path_nfs() (1 of 20 failed)
>> FAIL: opal_path_nfs
>>
>> What I find digging deeper is that the mount of /soft is a bit unusual (at least to me):
>>
>> $ grep /soft /etc/fstab
>> /gpfs/vesta_scratch/software /soft none _netdev,bind 0 0
>> $ mount | grep /soft
>> /gpfs/vesta_scratch/software on /soft type none (rw,bind,_netdev)
>> $ grep /soft /proc/mounts
>> /dev/vesta_scratch /soft gpfs rw,relatime 0 0
>>
>>
>> Looking into the mount man page, I find that "_netdev" is NOT a problem; it is just a keyword used to identify mounts that require network access, so that mount can implement "-O no_netdev".
>> The "problem" that opal_path_nfs is encountering is that this is a "bind" mount, which makes an already-mounted fs (or a subtree of one) available at a second location.
>>
>> If I am understanding "expected:0 got:1" correctly, this failure shows that the TEST is handling this case (a bind mount of a GPFS fs) incorrectly.
>> So, this is a BENIGN failure, but a distracting one (and it prevents "make check" from completing).
>>
>> -Paul
>>
>> --
>> Paul H. Hargrove PHHargrove_at_[hidden]
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
>

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/