Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] 1.4.5rc2 testing linux/ppc/IBM [SOLVED]
From: Paul H. Hargrove (PHHargrove_at_[hidden])
Date: 2012-01-27 15:18:38

On 1/27/2012 5:24 AM, Jeff Squyres wrote:
> On Jan 27, 2012, at 12:45 AM, Paul H. Hargrove wrote:
>> On this cluster, statfs() is returning ENOENT, which is breaking opal_path_nfs().
>> So, these results are with test/opal/util/opal_path_nfs.c "disabled".
> Paul -- can you explain this a little more? There should be logic in there to effectively handle ENOENT's, meaning that if we get a non-ESTALE error, we try again with the directory name. This is repeated until we get to "/" -- so there should definitely be at least one case where statfs() is *not* returning ENOENT.
> Is that not happening?

I looked a bit deeper and found that the bug is in OMPI, but a simple
one to fix.
I added 2 lines to opal/util/path.c:

--- openmpi-1.4.5rc2-orig/opal/util/path.c 2011-02-04 07:38:16.000000000 -0600
+++ openmpi-1.4.5rc2/opal/util/path.c 2012-01-27 12:46:30.000000000
@@ -476,6 +476,8 @@
          rc = statvfs (file, &buf);
  #elif defined(linux) || defined (__BSD) || (defined(__APPLE__) &&
          rc = statfs (file, &buf);
+ #error "No statvfs or statfs call"
      } while (-1 == rc && ESTALE == errno && (0 < --trials));

Can you guess what happens when I "make" now?
There IS no call to statfs, and the ENOENT I saw must have been "left
over" from some earlier libc call.
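
The "left over" errno effect is easy to reproduce in isolation. The following is a minimal sketch under stated assumptions: the function name, the nonexistent path, and the macro HYPOTHETICAL_PLATFORM_MACRO are made up for the demonstration and do not appear in Open MPI.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: returns the errno value a caller would observe
 * after a "call" that the preprocessor silently compiled out. */
int errno_after_missing_call(void)
{
    int rc = -1;

    /* An earlier, unrelated libc call fails and sets errno to ENOENT. */
    (void) open("/no/such/file/for/this/demo", O_RDONLY);

#if defined(HYPOTHETICAL_PLATFORM_MACRO)  /* never defined in this demo */
    rc = access("/", F_OK);
#endif

    /* rc still holds its initial -1, so the stale errno looks like
     * the reason the (nonexistent) call failed. */
    return (-1 == rc) ? errno : 0;
}
```

Setting errno to 0 immediately before the call of interest is one way to make this kind of mistake easier to spot.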

The problem is that these compilers do not predefine "linux".
They do appear to define "__linux" and "__linux__".
So, a small change to the preprocessor logic should fix this problem:
    $ perl -pi -e 's/defined\(linux\)/defined\(__linux__\)/;' --
[more compact than the corresponding diffs]
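
As a rough illustration of why "__linux__" is the safer test: "linux" is not a reserved identifier, so a compiler in a strict conformance mode may leave it undefined (GCC, for example, omits it under -std=c99 or -ansi while still defining "__linux__"). The sketch below is not the Open MPI code; both function names are hypothetical.

```c
/* Hypothetical helper: names the branch that a __linux__-based
 * preprocessor test would select at compile time. */
const char *fs_stat_branch(void)
{
#if defined(__linux__) || defined(__APPLE__)
    return "statfs";
#else
    return "none";
#endif
}

/* Returns 1 if the legacy, non-reserved macro `linux` is predefined
 * by this compiler in its current mode, else 0. */
int legacy_linux_macro_defined(void)
{
#if defined(linux)
    return 1;
#else
    return 0;
#endif
}
```

On a Linux build where legacy_linux_macro_defined() returns 0, a test on "defined(linux)" would silently skip the statfs branch, while fs_stat_branch() would still report "statfs".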

With that change (and without "disabling" opal_path_nfs.c) all 4
compilers are PASSing "make all install check".

Source inspection suggests that the 1.5 branch has the same issue.
I've not inspected the HEAD, but somebody should.

I've done a bit of grepping for "linux", "__linux", and "__linux__".
My search shows only 2 files checking for definition of "linux"
And exactly one looking for "__linux":
Checks for "__linux__" appear in the following files:
    test/util/opal_path_nfs.c (IRONY!)
I suggest standardization to "__linux__" in the 3 files that currently
use "linux" or "__linux".


Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900