Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] known limitation or bug in hwloc?
From: Kenneth Lloyd (kenneth.lloyd_at_[hidden])
Date: 2011-08-29 09:05:31


Nadia,

Interesting. I haven't tried pushing this to levels above 8 on a particular
machine. Do you think that the cpuset / paffinity / hwloc only applies at
the machine level, at which time you need to employ a graph with carto?

Regards,

Ken

-----Original Message-----
From: devel-bounces_at_[hidden] [mailto:devel-bounces_at_[hidden]] On
Behalf Of nadia.derbey
Sent: Monday, August 29, 2011 5:45 AM
To: Open MPI Developers
Subject: [OMPI devel] known limitation or bug in hwloc?

Hi list,

I'm hitting a limitation with paffinity/hwloc with cpu numbers >= 64.

In opal/mca/paffinity/hwloc/paffinity_hwloc_module.c, module_set() is
the routine that sets the calling process's affinity to the mask given as
a parameter. Note that "mask" is an opal_paffinity_base_cpu_set_t (so we
allow the cpus to be potentially numbered up to
OPAL_PAFFINITY_BITMASK_CPU_MAX - 1).

The problem with module_set() is that it loops over only
OPAL_PAFFINITY_BITMASK_T_NUM_BITS bits to check if these bits are set in
the mask:

    for (i = 0; ((unsigned int) i) < OPAL_PAFFINITY_BITMASK_T_NUM_BITS; ++i) {
        if (OPAL_PAFFINITY_CPU_ISSET(i, mask)) {
            hwloc_bitmap_set(set, i);
        }
    }

Given "mask"'s type, I think module_set() should instead loop over
OPAL_PAFFINITY_BITMASK_CPU_MAX bits.
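
To illustrate the effect of the loop bound, here is a minimal, self-contained
sketch. All demo_* / DEMO_* names are hypothetical stand-ins, not the actual
Open MPI definitions; the sketch assumes a cpu set made of 64-bit words with
room for 1024 cpus, and it only counts which cpus a conversion loop would see
for a given bound (it does not call hwloc).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the paffinity types and macros (assumptions). */
typedef uint64_t demo_word_t;
#define DEMO_WORD_NUM_BITS (sizeof(demo_word_t) * 8)  /* bits in ONE word: 64 */
#define DEMO_CPU_MAX       1024                       /* bits in the WHOLE set */
#define DEMO_NUM_WORDS     (DEMO_CPU_MAX / DEMO_WORD_NUM_BITS)

typedef struct {
    demo_word_t bitmask[DEMO_NUM_WORDS];
} demo_cpu_set_t;

/* Clear every cpu in the set. */
static void demo_cpu_zero(demo_cpu_set_t *set)
{
    memset(set, 0, sizeof(*set));
}

/* Mark one cpu as present in the set. */
static void demo_cpu_set(demo_cpu_set_t *set, size_t cpu)
{
    assert(cpu < DEMO_CPU_MAX);
    set->bitmask[cpu / DEMO_WORD_NUM_BITS] |=
        (demo_word_t)1 << (cpu % DEMO_WORD_NUM_BITS);
}

/* Return 1 if the cpu is present in the set, 0 otherwise. */
static int demo_cpu_isset(const demo_cpu_set_t *set, size_t cpu)
{
    assert(cpu < DEMO_CPU_MAX);
    return (int)((set->bitmask[cpu / DEMO_WORD_NUM_BITS] >>
                  (cpu % DEMO_WORD_NUM_BITS)) & 1);
}

/* Count the cpus a conversion loop would see when it scans `limit` bits. */
static size_t demo_count_visible(const demo_cpu_set_t *set, size_t limit)
{
    size_t n = 0;
    for (size_t i = 0; i < limit; ++i) {
        if (demo_cpu_isset(set, i)) {
            ++n;
        }
    }
    return n;
}
```

With cpus 3, 64 and 200 set, scanning DEMO_WORD_NUM_BITS bits sees only cpu 3,
while scanning DEMO_CPU_MAX bits sees all three. Under these assumptions, any
cpu numbered 64 or above is silently dropped by the shorter loop.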

Note that module_set() uses a type for its internal mask that is
coherent with OPAL_PAFFINITY_BITMASK_T_NUM_BITS (hwloc_bitmap_t).

So I'm wondering whether this is a known limitation I've never heard of
or an actual bug?

Regards,
Nadia
