Open MPI Development Mailing List Archives

Subject: [OMPI devel] 1.4.5rc2 opal_path_nfs failure follow-up
From: Paul H. Hargrove (PHHargrove_at_[hidden])
Date: 2012-01-30 05:14:51


On 1/29/2012 9:28 PM, Paul H. Hargrove wrote:
[snip]
>
> I also had to disable the opal_path_nfs test again on the POWER6
> machine, even w/ the linux->__linux__ change.
> I will report on that when/if I can determine the cause.
>
> -Paul
>

Following up on the opal_path_nfs test failures I reported when testing
xlc-[789].0:

I added tracing to print errno information for any failing statfs()
calls in opal/util/path.c.
The following is what I then see from "make check -C test/util":
> Making check in util
> make[2]: Entering directory `/home/hargrove/openmpi-1.4.5rc2/test/util'
> make opal_path_nfs
> make[3]: Entering directory `/home/hargrove/openmpi-1.4.5rc2/test/util'
> /bin/sh ../../libtool --tag=CC --mode=link gcc -O3 -DNDEBUG
> -finline-functions -fno-strict-aliasing -pthread -export-dynamic -o
> opal_path_nfs opal_path_nfs.o ../../opal/libopen-pal.la
> ../../test/support/libsupport.a -lnsl -lutil -lm
> libtool: link: gcc -O3 -DNDEBUG -finline-functions
> -fno-strict-aliasing -pthread -o .libs/opal_path_nfs opal_path_nfs.o
> -Wl,--export-dynamic ../../opal/.libs/libopen-pal.so -ldl
> ../../test/support/libsupport.a -lnsl -lutil -lm -pthread -Wl,-rpath
> -Wl,/home/hargrove/openmpi-1.4.5rc2/INST/lib
> make[3]: Leaving directory `/home/hargrove/openmpi-1.4.5rc2/test/util'
> make check-TESTS
> make[3]: Entering directory `/home/hargrove/openmpi-1.4.5rc2/test/util'
> @ statfs("/projects") failed (14:Bad address)
> @ statfs("") failed (2:No such file or directory)
> Failure : Mismatch: input "/projects", expected:1 got:0
>
> SUPPORT: OMPI Test failed: opal_path_nfs() (1 of 16 failed)
> FAIL: opal_path_nfs
> ========================================================
> 1 of 1 test failed
> Please report to http://www.open-mpi.org/community/help/
> ========================================================

As you can see, the statfs() call for the "/projects" mount point failed
with errno=EFAULT.
I cannot think of any sane way to account for such an error return.
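
For anyone wanting to reproduce this outside the test suite, the tracing amounts to something like the sketch below. This is only an illustration: the helper name traced_statfs() is hypothetical and is not the code in opal/util/path.c, and it assumes a Linux system where statfs() is declared in <sys/vfs.h>. The message format mirrors the "@ statfs(...)" lines in the output above.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/vfs.h>   /* statfs() and struct statfs on Linux */

/* Hypothetical wrapper (not the actual opal/util/path.c code):
 * call statfs() and, on failure, print the errno number and text. */
static int traced_statfs(const char *path, struct statfs *buf)
{
    int rc = statfs(path, buf);
    if (rc != 0) {
        int err = errno;   /* save errno before stdio can change it */
        fprintf(stderr, "@ statfs(\"%s\") failed (%d:%s)\n",
                path, err, strerror(err));
        errno = err;       /* restore errno for the caller */
    }
    return rc;
}
```

Calling traced_statfs("", &buf) with a local struct statfs buf returns -1 with errno set to ENOENT, which corresponds to the second traced line in the output above.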

Use of strace confirms:
     statfs("/projects", 0xffffd8e0) = -1 EFAULT (Bad address)
while other mount points are fine, such as:
     statfs("/opt/shared", {f_type="NFS_SUPER_MAGIC", f_bsize=8192,
         f_blocks=131068, f_bfree=50166, f_bavail=50166, f_files=0, f_ffree=0,
         f_fsid={0, 0}, f_namelen=255, f_frsize=8192}) = 0
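
To make the stakes concrete: a check that depends on f_type, as in the successful call above, has nothing to go on when statfs() itself fails. The sketch below is a minimal illustration under stated assumptions, not the actual opal_path_nfs() implementation. The function name path_is_nfs() is hypothetical, and the 0x6969 fallback is the NFS_SUPER_MAGIC value listed in the Linux statfs(2) man page.

```c
#include <sys/vfs.h>   /* statfs() and struct statfs on Linux */

#ifndef NFS_SUPER_MAGIC
#define NFS_SUPER_MAGIC 0x6969   /* value documented in statfs(2) */
#endif

/* Hypothetical helper (not the real opal_path_nfs()):
 * returns 1 if path is on NFS, 0 if it is not,
 * and -1 if statfs() failed, so the answer is unknown. */
static int path_is_nfs(const char *path)
{
    struct statfs buf;

    if (statfs(path, &buf) != 0) {
        return -1;   /* e.g. the EFAULT seen for "/projects" */
    }
    return ((unsigned long) buf.f_type == NFS_SUPER_MAGIC) ? 1 : 0;
}
```

With a three-way result like this, a caller can at least tell "not NFS" apart from "could not determine", which the got:0 in the failure message above does not.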

Not sure what one could expect opal_path_nfs() to do in the face of such
an odd failure.
If anybody has suggestions for follow-up, let me know.
And, yes, /projects *is* an NFS mount.

-Paul

-- 
Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900