Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] hwloc-1.7 issue roundup
From: Pavan Balaji (balaji_at_[hidden])
Date: 2013-05-07 10:59:20


Thanks Brice. What's the timeframe for 1.7.1? I want to see if we can
move to that for our mpich-3.1 release.

 -- Pavan

On 05/07/2013 09:53 AM US Central Time, Brice Goglin wrote:
> Pavan,
>
> I just pushed another round of commits to trunk, which hopefully address
> everything raised earlier (except the libltdl problem, which is waiting
> for Jeff's review).
>
> 1) autogen args
> https://svn.open-mpi.org/trac/hwloc/changeset/5586
> 2) sys/sysctl.h #ifdef fix
> https://svn.open-mpi.org/trac/hwloc/changeset/5587
> 3) sysctl/sysctlbyname check fix (includes an additional fix for icc)
> https://svn.open-mpi.org/trac/hwloc/changeset/5598
> 4) HWLOC_CHECK_DECL improvement
> https://svn.open-mpi.org/trac/hwloc/changeset/5599
> 5) strtoull check
> https://svn.open-mpi.org/trac/hwloc/changeset/5600
>
> (1) and (2) will be backported to v1.7 once regression testing is done.
> (5) should likely be OK too.
> (3) and, more importantly, (4) touch ugly configury, so I am not sure I
> want to backport these. Maybe later, if we get more testing before
> v1.7.1 is released.
>
> Brice
>
>
>
>
> On 05/05/2013 18:18, Pavan Balaji wrote:
>> All,
>>
>> Sorry for starting a new thread. I'm trying to round up all the issues
>> I've reported for hwloc-1.7 so far into a more manageable format.
>>
>> 1. We had noticed errors with -D_POSIX_SOURCE that I had reported here:
>>
>> http://www.open-mpi.org/community/lists/hwloc-devel/2013/04/3649.php
>>
>> The error with _POSIX_SOURCE itself was pretty straightforward. I was
>> able to fix it in the mpich version:
>>
>> http://git.mpich.org/mpich.git/commitdiff/255da3f6
>>
>> However, with our complete strict build flags, there were more errors.
>> Here's a summary and the relevant fixes:
>>
>> - hwloc's check for whether an explicit function declaration is
>> needed (using _HWLOC_CHECK_DECL) relied on whether a dummy call to
>> the function throws an error. This only works if the function
>> declaration is already present in one of the headers. If no such
>> declaration is present, the test can still fail with "implicit function
>> declaration" under sufficiently strict CFLAGS. This leads the m4 macro
>> to conclude that the declaration is already there in one of the headers
>> and that an additional declaration is not needed.
>>
>> The below commit fixes this by adding a dummy function declaration,
>> together with the dummy function definition:
>>
>> http://git.mpich.org/mpich.git/commitdiff/90da6e90
>>
>> FWIW, mpich's version of this macro also uses a similar dummy function
>> declaration together with the dummy call to the function:
>>
>> http://git.mpich.org/mpich.git/blob/HEAD:/confdb/aclocal_cc.m4#l1215
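
To illustrate the idea of pairing a dummy declaration with the dummy call, here is a rough sketch of what such a probe source could look like. It is not the actual hwloc or mpich macro output: the name hypothetical_fn, the odd prototype, and probe_call are made up for illustration. If a header already declared the function with its real prototype, the dummy declaration would conflict and the compile would fail ("declared"). If no header declares it, the probe compiles cleanly without ever hitting "implicit function declaration".

```c
#include <stdlib.h>  /* stand-in for the headers being probed (assumption) */

/* Dummy declaration with a deliberately unusual prototype. It would clash
 * with any real prototype coming from the headers above. */
int hypothetical_fn(int, long, int, long);

/* Dummy call matching the dummy declaration, so it is never implicit. */
int probe_call(void)
{
    return hypothetical_fn(1, 2L, 3, 4L);
}

/* A configure compile test would stop at compilation. This definition is
 * supplied only so that the sketch also links and runs on its own. */
int hypothetical_fn(int a, long b, int c, long d)
{
    return a + (int)b + c + (int)d;
}
```

In a real configure check, the result is read from whether compilation succeeds, not from running anything.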
>>
>> - For sysctl and sysctlbyname, I've updated hwloc/config.m4 to use a
>> full link test instead of just using AC_CHECK_FUNCS, which only checks
>> to see if the symbol exists or not. For example, the prototype of
>> sysctl uses u_int, which on some platforms (such as FreeBSD) is only
>> defined under __BSD_VISIBLE, __USE_BSD or other similar definitions. So
>> while the symbols "sysctl" and "sysctlbyname" might still be available
>> in libc (which autoconf checks for), they might not be actually usable.
>>
>> The below commit fixes this:
>>
>> http://git.mpich.org/mpich.git/commitdiff/db276e4e
>>
>> - A minor error where strings.h was not included for strcasecmp.
>>
>> http://git.mpich.org/mpich.git/commitdiff/d2338c2d
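
As a small illustration of the strings.h point: POSIX declares strcasecmp in <strings.h>, so under strict flags the include has to be explicit. The helper name names_equal_nocase below is hypothetical and is not taken from hwloc or mpich.

```c
#include <strings.h>  /* POSIX declares strcasecmp here, not in <string.h> */

/* Hypothetical helper: returns 1 if the two strings match ignoring case,
 * 0 otherwise. */
int names_equal_nocase(const char *a, const char *b)
{
    return strcasecmp(a, b) == 0;
}
```

Without the include, a strict build can reject the strcasecmp call as an implicit function declaration.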
>>
>> 2. I had reported an issue with libltdl in embedded mode (also in the
>> above thread). I believe Brice is looking into this, so I didn't
>> investigate it further. I'm using a disgusting, but workable, patch to
>> work around this error in mpich (see the
>> src/pm/hydra/tools/topo/hwloc/hwloc/src/Makefile.am part of the below
>> patch):
>>
>> http://git.mpich.org/mpich.git/commitdiff/a3bce754
>>
>> I'd appreciate a cleaner fix to this issue.
>>
>> 3. I had reported an issue with the usage of getpagesize() instead of
>> hwloc_getpagesize() on the Mac.
>>
>> http://www.open-mpi.org/community/lists/hwloc-devel/2013/05/3662.php
>>
>> I believe Samuel has already incorporated this in hwloc trunk. Here is
>> the fix I used for your reference:
>>
>> http://git.mpich.org/mpich.git/commitdiff/d9a67f40
>>
>> 4. I had reported some warnings on the FreeBSD strict build here.
>>
>> http://www.open-mpi.org/community/lists/hwloc-devel/2013/05/3669.php
>>
>> I believe Brice and Samuel are looking into it, but I don't have a
>> confirmation on whether this is fixed. I didn't fix them in mpich yet.
>>
>> As you can tell, we are looking into upgrading to hwloc-1.7 for the next
>> major release of mpich (3.1). With the above fixes, it looks like
>> things are working well. Of course, we'll be going through a lot more
>> testing before the final release, which would be later this year.
>>
>> -- Pavan
>>
>

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji