Open MPI Development Mailing List Archives

Subject: [OMPI devel] [PATCH] Re: Still having issues w/ opal_path_nfs and EPERM
From: Paul Hargrove (phhargrove_at_[hidden])
Date: 2014-02-09 19:36:01


I found the source of the problem, and a solution.

The following is r30612, in which Jeff thought he had fixed the problem:

--- opal/util/path.c (revision 30611)
+++ opal/util/path.c (revision 30612)
@@ -515,12 +515,17 @@
     } while (-1 == vfsrc && ESTALE == errno && (0 < --trials));
 #endif

-    /* In case some error with the current filename, try the directory */
+    /* In case some error with the current filename, try the parent
+       directory */
     if (-1 == fsrc && -1 == vfsrc) {
         char * last_sep;

         OPAL_OUTPUT_VERBOSE((10, 0, "opal_path_nfs: stat(v)fs on file:%s failed errno:%d directory:%s\n",
                              fname, errno, file));
+        if (EPERM == errno) {
+            free(file);
+            return false;
+        }

         last_sep = strrchr(file, OPAL_PATH_SEP[0]);
         /* Stop the search, when we have searched past root '/' */

The new code to handle EPERM isn't reachable in general, because fsrc and
vfsrc aren't initialized and thus won't both be -1 unless both calls have
run and failed. However, on many/most systems only one of the two calls
exists. If I weren't passing --enable-debug to all my builds, I suspect I'd
have seen a used-uninitialized warning for the "if" (which is a product of
other recent fixes in this function, r30198 on trunk).
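
To illustrate the reachability problem, here is a minimal sketch. It is not
the opal_path_nfs() code: the DEMO_HAVE_* macros and the demo_* functions are
hypothetical stand-ins, and the stub simply pretends the call failed with
EPERM. It shows why both return codes need a defined "failed or never
attempted" value when one of the two calls may be compiled out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical feature macros: only one of the two probes is available. */
#define DEMO_HAVE_STATFS 1
/* DEMO_HAVE_STATVFS is deliberately left undefined. */

/* Hypothetical stub that stands in for a failing statfs() call. */
static int demo_statfs(const char *path)
{
    (void)path;
    errno = EPERM;   /* pretend access was refused */
    return -1;
}

/* Returns true when every probe that was compiled in has failed. */
bool demo_all_probes_failed(const char *path)
{
    /* Initializing to -1 makes "never attempted" look like "failed".
     * Without the initializers, the variable for the compiled-out probe
     * would be read uninitialized in the test below. */
    int fsrc = -1;
    int vfsrc = -1;

#if defined(DEMO_HAVE_STATFS)
    fsrc = demo_statfs(path);
#endif
#if defined(DEMO_HAVE_STATVFS)
    vfsrc = demo_statvfs(path);   /* compiled out in this sketch */
#endif

    return (-1 == fsrc && -1 == vfsrc);
}
```

With the initializers in place, the combined test is true whenever the one
available call fails, so an errno check placed inside that branch becomes
reachable.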

The following fixes the problem for me:

--- opal/util/path.c~ 2014-02-09 23:52:37.764571000 +0100
+++ opal/util/path.c 2014-02-09 23:53:21.800383233 +0100
@@ -471,7 +471,8 @@
 {
 #if !defined(__WINDOWS__)
     int i;
-    int fsrc, vfsrc;
+    int fsrc = -1;
+    int vfsrc = -1;
     int trials;
     char * file = strdup (fname);
 #if defined(USE_STATFS)

-Paul

On Sat, Feb 8, 2014 at 10:00 AM, Ralph Castain <rhc_at_[hidden]> wrote:

> Sounds like it - I'll take a peek and see if I can spot it, otherwise will
> have to wait for Jeff next week
>
> On Feb 8, 2014, at 9:56 AM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
>
> A test of Friday night's trunk tarball is failing in the same manner.
> So, the CMR isn't the issue - the problem was never (fully?) fixed in
> trunk.
>
> -Paul
>
>
> On Fri, Feb 7, 2014 at 9:06 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
>
>> I tested the 1.7 tarball tonight.
>> Jeff had indicated (
>> http://www.open-mpi.org/community/lists/devel/2014/01/13785.php) that
>> the problem I had reported w/ opal_path_nfs() and EPERM had been fixed in
>> the trunk.
>> Trac ticket #4125 indicated the fix was CMRed to v1.7
>>
>> However, I still see the problem:
>> Failure : Mismatch: input "/users/course13/.gvfs", expected:0 got:1
>>
>> Failure : Mismatch: input "/users/steineju/.gvfs", expected:0 got:1
>>
>> SUPPORT: OMPI Test failed: opal_path_nfs() (2 of 20 failed)
>> FAIL: opal_path_nfs
>>
>>
>> I don't currently know if the problem was ever fixed on the trunk, but
>> should know by morning.
>>
>> -Paul
>>
>> --
>> Paul H. Hargrove PHHargrove_at_[hidden]
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>
>
>
>
> --
> Paul H. Hargrove PHHargrove_at_[hidden]
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>

-- 
Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group
Computer and Data Sciences Department     Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900