Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] hwloc-1.7 issue roundup
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2013-05-07 11:07:01


We got several important bug reports (and fixes) in the last few days. I
think we don't need anything new aside from the pending LTDL fix. So
let's say 1.7.1 within 2 weeks.

Brice

On 07/05/2013 16:59, Pavan Balaji wrote:
> Thanks Brice. What's the timeframe for 1.7.1? I want to see if we can
> move to that for our mpich-3.1 release.
>
> -- Pavan
>
> On 05/07/2013 09:53 AM US Central Time, Brice Goglin wrote:
>> Pavan,
>>
>> I just pushed another round of commits to trunk, which hopefully address
>> everything raised earlier (except the LTDL problem waiting for Jeff's review).
>>
>> 1) autogen args
>> https://svn.open-mpi.org/trac/hwloc/changeset/5586
>> 2) sys/sysctl.h #ifdef fix
>> https://svn.open-mpi.org/trac/hwloc/changeset/5587
>> 3) sysctl/sysctlbyname check fix (includes an additional fix for icc)
>> https://svn.open-mpi.org/trac/hwloc/changeset/5598
>> 4) HWLOC_CHECK_DECL improvement
>> https://svn.open-mpi.org/trac/hwloc/changeset/5599
>> 5) strtoull check
>> https://svn.open-mpi.org/trac/hwloc/changeset/5600
>>
>> (1) and (2) will be backported to v1.7 once regression testing is done.
>> (5) should likely be OK too.
>> (3) and, more importantly, (4) touch ugly configury, so I am not sure I
>> want to backport these. Maybe later, if we get more testing before
>> v1.7.1 is released.
>>
>> Brice
>>
>>
>>
>>
>> On 05/05/2013 18:18, Pavan Balaji wrote:
>>> All,
>>>
>>> Sorry for starting a new thread. I'm trying to round up all the issues
>>> I've reported for hwloc-1.7 so far into a more manageable format.
>>>
>>> 1. We had noticed errors with -D_POSIX_SOURCE that I had reported here:
>>>
>>> http://www.open-mpi.org/community/lists/hwloc-devel/2013/04/3649.php
>>>
>>> The error with POSIX_SOURCE itself was pretty straightforward. I was
>>> able to fix it in the mpich version:
>>>
>>> http://git.mpich.org/mpich.git/commitdiff/255da3f6
>>>
>>> However, with our complete strict build flags, there were more errors.
>>> Here's a summary and the relevant fixes:
>>>
>>> - hwloc's check for whether an explicit function declaration is
>>> needed (using _HWLOC_CHECK_DECL) was relying on whether a dummy call to
>>> the function throws an error. This only works if the function
>>> declaration is already present in one of the headers. If such a
>>> declaration is not present, the test might still fail with an "implicit
>>> function declaration" error under strict CFLAGS. This leads the m4 macro
>>> to think that the declaration is already there in one of the headers and
>>> that an additional declaration is not needed.
>>>
>>> The below commit fixes this by adding a dummy function declaration,
>>> together with the dummy function definition:
>>>
>>> http://git.mpich.org/mpich.git/commitdiff/90da6e90
>>>
>>> FWIW, mpich's version of this macro also uses a similar dummy function
>>> declaration together with the dummy call to the function:
>>>
>>> http://git.mpich.org/mpich.git/blob/HEAD:/confdb/aclocal_cc.m4#l1215
>>>
>>> - For sysctl and sysctlbyname, I've updated hwloc/config.m4 to use a
>>> full link test instead of just using AC_CHECK_FUNCS, which only checks
>>> to see if the symbol exists or not. For example, the prototype of
>>> sysctl uses u_int, which on some platforms (such as FreeBSD) is only
>>> defined under __BSD_VISIBLE, __USE_BSD or other similar definitions. So
>>> while the symbols "sysctl" and "sysctlbyname" might still be available
>>> in libc (which autoconf checks for), they might not be actually usable.
>>>
>>> The below commit fixes this:
>>>
>>> http://git.mpich.org/mpich.git/commitdiff/db276e4e
>>>
>>> - A minor error where strings.h was not included for strcasecmp.
>>>
>>> http://git.mpich.org/mpich.git/commitdiff/d2338c2d
>>>
>>> 2. I had reported an issue with libltdl in embedded mode (also in the
>>> above thread). I believe Brice is looking into this, so I didn't
>>> investigate it further. I'm using a disgusting, but workable, patch to
>>> work around this error in mpich (see the
>>> src/pm/hydra/tools/topo/hwloc/hwloc/src/Makefile.am part of the below
>>> patch):
>>>
>>> http://git.mpich.org/mpich.git/commitdiff/a3bce754
>>>
>>> I'd appreciate a cleaner fix to this issue.
>>>
>>> 3. I had reported an issue with the usage of getpagesize() instead of
>>> hwloc_getpagesize() on the Mac.
>>>
>>> http://www.open-mpi.org/community/lists/hwloc-devel/2013/05/3662.php
>>>
>>> I believe Samuel has already incorporated this in hwloc trunk. Here is
>>> the fix I used for your reference:
>>>
>>> http://git.mpich.org/mpich.git/commitdiff/d9a67f40
>>>
>>> 4. I had reported some warnings on the FreeBSD strict build here:
>>>
>>> http://www.open-mpi.org/community/lists/hwloc-devel/2013/05/3669.php
>>>
>>> I believe Brice and Samuel are looking into it, but I don't have a
>>> confirmation on whether this is fixed. I didn't fix them in mpich yet.
>>>
>>> As you can tell, we are looking into upgrading to hwloc-1.7 for the next
>>> major release of mpich (3.1). With the above fixes, it looks like
>>> things are working well. Of course, we'll be going through a lot more
>>> testing before the final release, which will be later this year.
>>>
>>> -- Pavan
>>>
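
The _HWLOC_CHECK_DECL misdetection described in item 1 of the roundup can be
modeled as a small truth table. The Python sketch below is only an
illustration, not hwloc or mpich code. The function names are hypothetical,
and it assumes two things taken from the description: the original probe
fails to compile either when a real prototype rejects the dummy call or when
strict CFLAGS turn an implicit declaration into an error, and the fixed
probe (dummy declaration plus dummy call) can only fail when it clashes with
a real prototype.

```python
from itertools import product


def original_probe_says_declared(declared_in_headers, implicit_decl_is_error):
    """Model of the original check: 'compile failure' is read as 'declared'."""
    # Assumption: the dummy call fails if a real prototype rejects it, OR if
    # strict flags reject the implicit declaration of an undeclared function.
    compile_fails = declared_in_headers or implicit_decl_is_error
    return compile_fails


def fixed_probe_says_declared(declared_in_headers, implicit_decl_is_error):
    """Model of the fixed check: a dummy declaration precedes the dummy call."""
    # Assumption: with the dummy declaration in place the call is never
    # implicit, so only a clash with a real prototype makes the compile fail.
    compile_fails = declared_in_headers
    return compile_fails


def misdetections(probe):
    """Return the (declared, strict_flags) cases where the probe is wrong."""
    return [
        (declared, strict)
        for declared, strict in product([False, True], repeat=2)
        if probe(declared, strict) != declared
    ]


if __name__ == "__main__":
    print("original:", misdetections(original_probe_says_declared))
    print("fixed:   ", misdetections(fixed_probe_says_declared))
```

Under these assumptions the original probe is wrong in exactly one case: no
declaration in the headers combined with strict flags, which is the failure
reported in the thread. The modeled fix has no wrong cases.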